About This Book 



Prerequisites 



This book provides information that you need to know in order to port programs 
from other operating systems to the RISC/os operating system. 



This book assumes you are porting a program from another machine to a MIPS 
RISComputer. You should understand both RISC/os and the operating system 
that you are porting from. You should also be fluent in the programming lan- 
guage you're using and be comfortable using UNIX system tools to write your 
programs. 



What Does This Book Cover? 

This book has these chapters: 



Chapter 1 — Overview. Describes a process to follow when you 
are porting a program to the MIPS computing environment. 

Chapter 2 — Trouble Shooting. A composite of information 
taken from all of the chapters in this manual. It contains tables 
that summarize many problems, and their solutions, that you 
may encounter when porting a program. 

Chapter 3 — RISC/os Considerations. Describes operating sys- 
tem dependencies that you need to be aware of when you are 
porting a C program from another operating system to a MIPS 
system. 

Chapter 4— Hardware Related Considerations. Discusses 
specific MIPS RISComputer implementations that you must con- 
sider when porting a program. 

Chapter 5— Undefined Language Elements. Explains how the 
RISCompiler System defines certain constructions in C, Pascal, 
PL/I, and other high-level languages that may differ in the pro- 
gram that you are porting. 

Chapters 6—10 — Contains information specific to C, FOR- 
TRAN, Pascal, COBOL, RISCwindows, and PL/I programming 
languages. 

Chapter 10 — Programming Tools. Discusses considerations 
for debugging, compiling, and link editing your programs. 



For More Information 



As you begin to port programs to our system, you may also need to refer to these 
books: 



Book                                                      Order Number 

Assembly Language Programmer's Guide                      3201DOC 
Language Programmer's Guide                               3100DOC 
MIPS-COBOL Programmer's Guide and Language Reference      3105DOC 
MIPS-FORTRAN Programmer's Guide and Language Reference    3103DOC 
MIPS-PL/I Programmer's Guide and Language Reference       3107DOC 
MIPS R2000 RISC Architecture                              3113DOC 
RISC/os Programmer's Reference Manual                     3203DOC 
RISC/os User's Reference Manual                           3204DOC 
RISCwindows Reference Volumes I, II, and III              3130DOC 



While porting a program, you will need to refer to the MIPS Release Notes that 
accompanied your RISC/os software. You should also have the publications 
listed below available for reference: 

ANSI Pascal 

ANSI Fortran 

ANSI PL/I 

Cobol Standard 

IEEE 754-1985 (floating point) 
Overview 



The purpose of this chapter is to describe a process to follow when you are port- 
ing programs to the MIPS RISComputer environment. 



1.1 Assembly Language Programs 



This manual is a compilation of information that you need to port a C, Pascal, 
FORTRAN, or PL/I program to a MIPS RISComputer. This manual does not 
describe how to port assembly language programs. If you are thinking about 
porting an assembly program and your only reason for coding in assembly lan- 
guage is speed, seriously consider re-coding in a high-level language. Because 
it produces highly efficient machine language code for all supported high-level 
languages, the MIPS RISCompiler reduces the need for assembly language pro- 
gramming. 



1.2 Making a Program Portable 



You can make the task of porting programs easier by following the guidelines 
listed below when you create your program. 

• Avoid breaking the rules of the source language and avoid using 
its non-standard features 

• Avoid using source language features that are machine dependent 

• Avoid relying on anything that is operating system dependent 

• When you have to do any of the things listed above, encapsulate 
them in modules marked "system dependent", use conditional 
compilation #ifdef statements around them, and explain the 
situation with plenty of comments 
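
The last guideline can be sketched in C as follows. This is a minimal illustration, not code from this manual: the macro PORT_USE_INDEX and the names port_find_char and file_extension are hypothetical, and the example assumes a host that supplies either the BSD index() routine or the System V strchr() routine.

```c
#include <string.h>

/*
 * SYSTEM DEPENDENT: some BSD-derived systems supply index() where
 * System V systems supply strchr(). PORT_USE_INDEX is a hypothetical
 * macro; define it (for example with -DPORT_USE_INDEX) only on a
 * host that lacks strchr().
 */
#ifdef PORT_USE_INDEX
#include <strings.h>
#define port_find_char(s, c) index((s), (c))
#else
#define port_find_char(s, c) strchr((s), (c))
#endif

/*
 * Portable code calls only the wrapper, never index() or strchr()
 * directly. Returns the text after the first '.' in name, or an
 * empty string if name contains no '.'.
 */
const char *file_extension(const char *name)
{
    const char *dot = port_find_char(name, '.');
    return dot ? dot + 1 : "";
}
```

Keeping the dependency behind one macro means that a later port changes a single, clearly marked block rather than every call site.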

1.3 Information You Need to Know 

Before you undertake a porting project, look over the following checklist. 
Where applicable, cross-references are given for detailed information. Make 
sure that you are aware of each topic listed below, as you'll save time when port- 
ing your program. 

• All programming languages: 

How MIPS Makefiles work (Chapter 1) 
What the -G value is (Chapter 10) 
Floating point differences (Chapter 4) 
Differences in the address space organization (Chapter 4) 
How the #ifdef conditional works (Chapter 6) 

• C programming language: 

How to use the lint program (Section 6.2) 

How to use variable arguments (Section 6.8) 

How to determine which #defines are appropriate (Section 
6.1) 

Characters unsigned by default (Section 6.4) 

• Pascal programming language: 

About the precision of type "real" (Section 43) 

How MIPS RISComputer's single compilation process dif- 
fers from yours 

How memory allocation works (Section 12) 

No two Pascals are the same, because the language has never 
been adequately standardized and the basic Pascal package is 
limited without adding extensions. 

• FORTRAN programming language: 

How static variables differ from automatic variables (Chap- 
ter 8) 

If the size of your machine's double precision differs from MIPS 
RISComputer 

• PL/I programming language: 

The differences between the full ANSI PL/I and ANSI 
subset G PL/I 

How MIPS RISComputer handles the nil value for PL/I 
(Section 5.1) 

• COBOL programming language (for information on COBOL, see the 
MIPS-COBOL Programmer's Guide and Language Reference): 

Usage Optimization 
Packed Decimal Representation 
Calling Separately Linked Routines 
Sequential Files 
1.4 Porting a Program 



The following figure shows the major steps involved in porting an application to 
a RISComputer. 




[Figure: the major porting steps, ending with "Finalize the project"] 



The following sections describe each of the above steps in detail. 



1.4.1 Build/Modify Makefiles 



Large applications are accompanied by UNIX Makefiles, which contain the 
commands that build an executable program and which are processed by the 
RISC/os make facility. This facility provides a method for maintaining 
up-to-date versions of programs that consist of a number of files that may be 
generated in a variety of ways. The Makefile is the description file through 
which the make(1) command keeps track of the commands that create files and 
the relationships between files. Whenever a change is made in any of the files 
that make up a program, the make command creates the finished program by 
recompiling only those portions directly or indirectly affected by the change. 

For more information on Makefiles, refer to the make(1) manual page and 
Chapter 13 in the Language Programmer's Guide. 
Before you can build your application, you need to modify the following parts of 
your Makefile: 

• Include the path names for your source program and the include 
files that it uses. 

• Include the path names for the link libraries. See intro(3) for a 
list of MIPS supported libraries. 

• Include a path name for the directory where the application is to 
be accessed if the Makefile has an install target. 

• Add compilation and link editor flags. See cc(1), f77(1), pc(1), 
ld(1), cobol(1), and pl/I(1). 

If a Makefile does not exist for the application, and you need to create one, then 
refer to the MIPS Makefile standard in Appendix B of this book. 

1.4.2 Build Executables 

To obtain a debugging version of your program that is not stripped or optimized: 

1. Build the executable programs for the application by running the 
Makefile to compile and link the programs. Obtaining the correct 
driver option settings may involve modifying the Makefile 
several times. 

2. Compile using the -g debugger option for full symbolic debug- 
ging. 

If for some reason you wish to run your application before stripping the symbol 
table information and optimization, you should skip the above two steps, and 
wait until the step outlined in 1.4.4 below. 



1.4.3 Run the Application 



When you run your application, more errors may occur. The trouble shooting 
guide in Chapter 2 should help solve some of the run-time errors that you 
receive. If your program still has errors, then use debugging tools such as 
dbx(1) and nm(1). Also, refer to Chapters 3-5, and Chapters 6-10, as applicable, 
for additional information on possible causes of errors. 

Commercial applications usually include a test suite. This is the time to run the 
tests. If no tests are provided, then write and automate your own tests with shell 
scripts, using sh(1) or csh(1). 

1.4.4 Optimize Performance 

Once your application is debugged, tested, and working, then you should recom- 
pile the final version with the proper optimization level and link edit flags. For 
information on optimization, see Chapter 10 in this manual, and Chapter 4 in 
the Language Programmer's Guide. 
All tests should be rerun at this point to further reduce the chance that machine, 
language, or operating system dependencies have slipped through. If the 
application fails unexpectedly at this point, then you probably still have machine 
or language dependencies that the optimizer did not detect. If this is the case, 
then isolate the area causing the problem and repair it. 

Refer to Chapter 4 of the Language Programmer's Guide for a description of 
optimization techniques for your application. You may wish to profile your code 
(see prof(1) and pixie(1)) to see where additional performance gains can be 
made. Another useful tuning tool is cord(1), which helps you improve cache 
performance. 



1.4.5 Install the Application 



Once your application is tested and optimized, install it in the proper directory. 
You can install the program either by hand, or with the Makefile install target. 



1.4.6 Finalize the Application 



The last step is probably the most tedious, but also the most important. If the 
user interface has changed at all, it is important to document the changes, not 
only in the code itself, but in the documentation. This is also the time to finalize 
the source code control system to manage the code. See sccs(1) and rcs(1). 



Trouble Shooting 



This chapter is a composite of information taken from all of the chapters in this 
manual. It contains tables that summarize many problems, and their solutions, 
that you may encounter when porting a program. 

Each table contains five columns, and each column describes one characteristic 
of the error. The column headings and the information they contain are given 
below: 

Column 1    When or how the error manifested itself 

Column 2    The source language(s) of the program most likely 
            to create the error 

Column 3    Symptom (general description of the error) 

Column 4    Possible problem source 

Column 5    Action (recommended action to correct the error; often, this is 
            a cross-reference to another section in this manual) 
3 
RISC/os 4.0 Considerations 



This chapter briefly describes RISC/os 4.0 and describes operating system de- 
pendencies that you need to be aware of when you are porting an application pro- 
gram from: 

• BSD 4.3 to RISC/os BSD mode 

• BSD 4.3 to RISC/os System V mode 

• System V to RISC/os BSD mode 

• System V to RISC/os System V mode 

3.1 RISC/os 4.0 

RISC/os 4.0 is an AT&T System V 3.0-based kernel with BSD enhancements, 
including all BSD 4.3 system calls, BSD 4.3 library functions, most BSD 4.3 
commands, TCP/IP networking, the NFS remote file system, and the Berkeley 
Fast File System. 

RISC/os is packaged so that a user can concurrently access either BSD or System 
V commands. However, programmers are restricted to programming in either 
the BSD or the System V environment; mixed mode programming is not fully 
supported. UMIPS 3.0 permitted System V programs to use some BSD system 
calls. These system calls, as noted in Section 3.3.1, plus a few more are still 
available in RISC/os 4.0 for System V programmers. 

3.2 Porting from BSD-Derived Systems 

3.2.1 Compiling BSD Programs Under RISC/os 4.0 

By default, RISC/os 4.0 is set up to compile programs under System V. In order 
to successfully compile programs for BSD functionality, you must do one of two 
things: 

1. Use the compile time switch -systype bsd43, which prepends /bsd43 to your 
normal path. 

% cc -systype bsd43 -g -o sample sample.c 

2. Place /bsd43/bin before /bin in the PATH variable in your .cshrc, .profile, or 
.login file. When you compile, your system goes to the /bsd43 command 
directory and uses the BSD cc command, which contains the switch -systype 
bsd43. 



MIPS RISCompiler Porting Guide 3-1 




If, however, you want to compile a BSD program for System V functionality 
and you have placed /bsd43 in your path prior to /bin, you must use the 
-systype sysv switch, as in: 

% cc -systype sysv -g -o sample sample.c 

The default compile time switch for /bin/cc is -systype sysv and the default 
compile time switch for /bsd43/bin/cc is -systype bsd43. 

3.2.2 Porting from 4.3 BSD to RISC/os (BSD based) 

Several areas must be considered when converting a program from a regular 
BSD system to a RISC/os BSD system. 

Include files Though textually different, the 4.3 BSD compilation environment 
include files are functionally equivalent to the 4.3 BSD include files. 

The only differences between the two are in areas where the system cannot be 
made compatible. For example, the file /etc/utmp does not contain the field 
ut_host, and the include file that describes the utmp file 
(/bsd43/usr/include/utmp.h) contains a special marker that causes any code 
using the ut_host field to not compile. Such code must be changed. 

Libraries All standard 4.3 BSD libraries are provided. However in some cases, 
the libraries use the corresponding System V code. For example, the libc rou- 
tines that get password and group file entries have been copied from System V. 

In the case of curses, the System V.3 curses (based on terminfo) is used. Except 
where the programs try to use the value of the buffer returned by the tgetent() 
function, this version of curses provides the entire 4.3 BSD interface. 

Termcap Only terminfo is supported. In general, programs that use termcap 
and/or curses work as expected. 

The features of termcap that are missing are: 

• The ability to modify the termcap on the fly. This is often done 
to set the terminal size. Instead, set the window size with 
winsize(1) or set the environment variables LINES and COLUMNS. 
The former is the preferred method. 

• The ability to add new capabilities to the database. This cannot 
be emulated without changing the code. 

IOCTL commands Virtually all 4.3 BSD IOCTLs are supported on RISC/os 
when using the 4.3 BSD Compilation Environment. The only time you need to 
make any changes to your code is if your program does extensive tty manipula- 
tion. If this is the case, you should convert the tty handling to System V. For 
more information, see termio(7). 

Command functionality. If a program "exec"s a BSD command, then you 
should verify that the command exists as a BSD command; that is, it can be 
found in /bsd43/bin, and the "exec" command path should be changed 
accordingly. Otherwise, you should make sure that System V functionality is 
sufficient. 
For a complete description of RISC/os system functionality, please see the 4.0 
Release Notes and check with your system administrator to verify that the BSD 
subpackage has been installed on your system. 

3.2.3 Porting from 4.3 BSD to RISC/os (SysV based) 

Many programs written in C can be ported from BSD systems without changing 
any code by specifying compiler and link editor driver options as follows: 

• During the compilation step, use the -I/usr/include/bsd and 
-signed options. The -I/usr/include/bsd option causes include 
files to be searched for in /usr/include/bsd before /usr/include, so 
that BSD values will take precedence, and the -signed option 
causes all char-typed data to be signed. 

• During the link editor step, use the -lsun, -lbsd, and -lrpcsvc 
driver options to link in routines that are not part of the standard 
C library for System V, but which are needed by BSD and NFS 
programs. 

Note: Use the -lsun and -lrpcsvc driver options only for programs requiring RPC 
and/or XDR. 
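
Where source changes are possible, code that depends on the signedness of char can be made explicit so that it no longer relies on the -signed option. The following is a minimal sketch; the function names are hypothetical and the example assumes 8-bit two's complement bytes.

```c
/*
 * Returns the value of c in the range 0..255, whatever the default
 * signedness of plain char is on the host compiler.
 */
int byte_value(char c)
{
    return (unsigned char)c;
}

/*
 * Returns the value of c in the range -128..127, which is the
 * behavior that code written for a signed-char compiler expects.
 */
int signed_byte_value(char c)
{
    return (signed char)c;
}

/* Nonzero when plain char is signed under the current options. */
int plain_char_is_signed(void)
{
    return (char)-1 < 0;
}
```

A comparison such as `c == -1` on a plain char silently changes meaning between compilers; routing it through one of the helpers above keeps the intent visible.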

3.2.4.1 Differences Between RISC/os SysV and BSD 

This section describes some of the differences between RISC/os 4.0 and 4.3 BSD 
UNIX. 

math.h Programs that use libm should be modified to include the header file 
math.h. You cannot assume that the type of these functions under RISC/os 4.0 
is the same as on other systems. 

longjmp() If your program calls the function longjmp() from signal handlers, 
it may need special work before being optimized with -O2. This is because 
global variables may be placed in registers and their values may not be restored 
properly. You may either explicitly declare the appropriate variables as volatile 
or use the -volatile compile option. Keep in mind that using this option 
significantly reduces the amount of optimization that is done. For a complete 
description of the -volatile option, see the Language Programmer's Guide. 
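
The explicit declaration can be sketched as follows. This is a simplified illustration in which longjmp() is called from an ordinary function rather than from a signal handler; the names recover_point, items_done, fail_midway, and process_items are hypothetical.

```c
#include <setjmp.h>

static jmp_buf recover_point;

/*
 * Declared volatile so that the optimizer keeps the value in memory
 * rather than in a register, and the value seen after longjmp()
 * is the most recent one stored.
 */
static volatile int items_done;

/* Stands in for the code path that abandons the work in progress. */
static void fail_midway(void)
{
    longjmp(recover_point, 1);
}

/*
 * Counts up to limit, then jumps back to the setjmp() point and
 * reports how many items had been completed at the time of the jump.
 */
int process_items(int limit)
{
    items_done = 0;
    if (setjmp(recover_point) != 0) {
        /* Arrived here through longjmp(). */
        return items_done;
    }
    for (;;) {
        if (items_done == limit)
            fail_midway();
        items_done++;
    }
}
```

Declaring only the variables that are read after the jump keeps the rest of the program fully optimizable, which is the advantage of this approach over the -volatile option.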

Variable Arguments The typical mechanism for passing variable argument 
lists on BSD systems is to assume that a parameter is a pointer to an array of 
pointers; this does not work on MIPS machines. Instead you must use the 
varargs.h or stdarg.h macros. For a description of these macros, see Appendix A 
of the Language Programmer's Guide. 
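
As an illustration of the stdarg.h form, the following sketch sums a caller-supplied number of int arguments. The function name sum_ints and its calling convention (a count followed by the values) are assumptions made for this example.

```c
#include <stdarg.h>

/*
 * Adds up count int arguments. The arguments are fetched one at a
 * time with va_arg() instead of being addressed as an array that
 * starts at the first parameter.
 */
int sum_ints(int count, ...)
{
    va_list ap;
    int total = 0;
    int i;

    va_start(ap, count);
    for (i = 0; i < count; i++)
        total += va_arg(ap, int);
    va_end(ap);
    return total;
}
```

For example, `sum_ints(3, 1, 2, 3)` adds the three values that follow the count.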

System Administration Files System administration files, such as /etc/passwd, 
/etc/inetd.conf, and the utmp file, may differ from what some applications 
expect. Some other files, such as /etc/ttys, may be missing. 

Dereferencing a Pointer Address 0 is an invalid address. On many BSD 
systems, this address is addressable; C programs may depend on being able to 
dereference pointers with this address. Dereferencing a pointer with a value of 
0 is incorrect according to all C standards. 
Pseudo-ttys Pseudo-ttys use a "clone device" instead of having pairs of pty/ 
tty. 

COFF Format The MIPS object file format (COFF) is a modified UNIX 
System V COFF and differs markedly from the BSD object file format. Therefore, 
BSD programs that process object file programs as input must be modified so as 
to access information correctly from the MIPS COFF. See Chapter 10 of the 
Assembly Language Programmer's Guide for a description of the format and 
contents of the MIPS COFF. 

tty Interface The tty driver interface does not have a complete emulation. 
Programs that rely heavily on the tty ioctls are difficult to port. 

Load Average The "avenrun" (load average) kernel symbol contains items of 
type FIX, as defined in sys/fixpoint.h, not doubles. 

malloc() On Apollo systems, there is a system call to pre-allocate memory for 
malloc(). This call is not supported and is not needed on MIPS machines. For 
information on additional malloc() library calls, see Chapter 11 in this 
manual. 

3.3 Porting from System V-Derived Systems 

A program that runs on System V will port more easily to RISC/os if you include 
math.h for math library functions. Do not assume that the type of these functions 
is the same as on other systems. 



libbsd.a 



The library /usr/lib/libbsd.a is a System V library provided by MIPS which 
contains some 4.3 BSD system calls and library routines. Because of file sizes 
when the library was introduced, the routines in this library have been renamed 
to approximate their 4.3 BSD routine names. For example, getdomnm in libbsd.a 
is getdomainname in the 4.3 BSD libc.a library. 

Note: All yp routines, such as yp_bind, yp_first, yp_match, yp_next, yp_order, 
ypclnt, yppasswd, and so on, have been removed from libbsd.a because yellow 
pages are not supported for RISC/os 4.0. You must provide your own stubs for 
these routines. Two additional compatibility libraries, /usr/lib/librpcsvc.a 
and /usr/lib/libsun.a, are also provided for network applications. If you find 
that a source file no longer compiles with libbsd.a, include librpcsvc.a and/or 
libsun.a in the link step. 

Table 3.1 lists the libbsd.a routines, the BSD libc.a names, and gives an explana- 
tion of the routine. 
Table 3.1 libbsd.a Routines 

libbsd.a file name                  4.3 BSD libc.a file name          Description 

accept (2-BSD)                                                        Accepts a connection on a socket 
bcopy, bcmp, bzero, ffs (3-BSD)                                       Bit and byte string operations 
bind (2-BSD)                                                          Bind a name to a socket 
connect (2-BSD)                                                       Initiate a connection on a socket 
dbm_open, dbm_close,                                                  Data base subroutines 
dbm_fetch, dbm_store, 
dbm_delete, dbm_firstkey, 
dbm_nextkey, dbm_error, 
dbm_clearerr (3) 
getdomnm (2-BSD)                    getdomainname, setdomainname      Get/set name of current domain 
getdtabsz (2-BSD)                   getdtablesize                     Get descriptor table size 
gethostid, sethostid (2-BSD)                                          Get/set unique identifier of current host 
gethostnamadr (3N-BSD)              gethostbyname, gethostbyaddr,     Get network host entry 
                                    gethostent, sethostent, 
                                    endhostent 
gethostname, sethostname (2-BSD)                                      Get/set name of current host 
getnetent, getnetbya,               getnetbyaddr, getnetbyname,       Get network entry 
getnetbynm (3N-BSD)                 setnetent, endnetent 
getpeernm (2-BSD)                   getpeername                       Get name of connected peer 
getprotoe (3N-BSD)                  getprotoent, getprotobynumber,    Get protocol entry 
                                    getprotobyname, setprotoent, 
                                    endprotoent 
getrlimit, setrlimit (2-BSD)                                          Control maximum system resource 
                                                                      consumption 
getrusage (2-BSD)                                                     Get information about resource utilization 
Table 3.1 libbsd.a Routines (cont.) 

libbsd.a name                       libc.a name        Description 

getservent, getservbyport,                             Get service entry 
getservbyname, setservent, 
endservent (3N-BSD) 
getsocknm (2-BSD)                   getsockname        Get socket name 
getsockopt, setsockopt (2-BSD)                         Get and set options on sockets 
gettimeofday, settimeofday (2-BSD)                     Get/set date and time 
getwd (3-BSD)                                          Get current working directory pathname 
htonl, htons, ntohl, ntohs (3N-BSD)                    Convert values between host and network byte order 
in_addr (3N-BSD)                    inet_addr          Internet address manipulation routines 
in_lnaof (3N-BSD)                   inet_lnaof         Internet address manipulation routines 
in_mkaddr (3N-BSD)                  inet_mkaddr        Internet address manipulation routines 
in_netof (3N-BSD)                   inet_netof         Internet address manipulation routines 
in_network (3N-BSD)                 inet_network       Internet address manipulation routines 
in_ntoa (3N-BSD)                    inet_ntoa          Internet address manipulation routines 
insque, remque (3-BSD)                                 Insert/remove element from a queue 
listen (2-BSD)                                         Listen for connections on a socket 
opendir, readdir, telldir,                             Directory operations 
seekdir, rewinddir, closedir (3-BSD) 
random, srandom,                                       Better random number generator; routines for 
initstate, setstate (3-BSD)                            changing generators 
rcmd, rresvport, ruserok (3-BSD)                       Routines for returning a stream to a remote command 
recv, recvfrom, recvmsg (2-BSD)                        Receive a message from a socket 
rexec (3-BSD)                                          Return stream to a remote command 
rindex (3-BSD)                                         String operations 
scandir, alphasort (3-BSD)                             Scan a directory 
Table 3.1 libbsd.a Routines (cont.) 

libbsd.a name                       libc.a name        Description 

select (2-BSD)                                         Synchronous I/O multiplexing 
send, sendto, sendmsg (2-BSD)                          Send a message from a socket 
setregid (2-BSD)                                       Set real and effective group ID 
setreuid (2-BSD)                                       Set real and effective user IDs 
setuid, seteuid, setruid,                              Set user and group ID 
setgid, setegid, setrgid (3-BSD) 
shutdown (2-BSD)                                       Shut down part of a full-duplex connection 
socket (2-BSD)                                         Create an endpoint for communication 
syslog, openlog,                                       Control system log 
closelog, setlogmask (3-BSD) 

3.3.1 RISC/os Differences 

This section describes some differences between RISC/os 4.0 and regular System 
V UNIX. 

Symbolic Links The presence of symbolic links can cause problems if you use 
the UNIX function chdir(). Programs should avoid using relative paths to change 
directories, because the sequence 

chdir("./subdir"); chdir("..") 

may not set the directory back to the original place. 
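
One way to avoid the problem is to record the absolute path of the starting directory and return to it by name. The sketch below assumes that getcwd() and chdir() are available; the function name visit_directory, the callback parameter, and the 4096-byte buffer size are choices made for this example.

```c
#include <stddef.h>
#include <unistd.h>

/*
 * Changes to subdir, runs work() there if work is not NULL, and then
 * returns to the starting directory by its saved absolute path
 * instead of by "..". Returns 0 on success and -1 on failure.
 */
int visit_directory(const char *subdir, void (*work)(void))
{
    char saved[4096];   /* assumed large enough for the current path */

    if (getcwd(saved, sizeof saved) == NULL)
        return -1;
    if (chdir(subdir) != 0)
        return -1;
    if (work != NULL)
        work();
    return chdir(saved);
}
```

Because the return trip uses the saved path, it does not matter whether subdir was reached through a symbolic link.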

File Name Size The maximum size of a file name is 255 characters, not 14. 
This causes problems when programs expect truncation. For example, a program 
could reject a user-supplied filename that is longer than 14 characters because it 
assumes that 14 characters is the limit. In addition, if a program declares an 
array to hold a filename, the array may be too small, especially if it is declared 
to be 15 characters long. MAXNAMLEN in /usr/include/dirent.h and 
MAXPATHLEN in /usr/include/sys/nami.h are appropriate to use instead. 
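
A buffer sized from MAXNAMLEN rather than from a literal 14 or 15 might look like the following sketch. The function name copy_file_name is hypothetical, and the fallback definition of MAXNAMLEN is an assumption for hosts whose dirent.h does not define it.

```c
#include <dirent.h>
#include <string.h>

#ifndef MAXNAMLEN
#define MAXNAMLEN 255   /* assumed fallback; prefer the system value */
#endif

/*
 * Copies name into dest, which must hold MAXNAMLEN + 1 bytes.
 * Returns 0 on success, or -1 if name is longer than the system
 * limit. The name is never silently truncated.
 */
int copy_file_name(char *dest, const char *name)
{
    size_t len = strlen(name);

    if (len > MAXNAMLEN)
        return -1;
    memcpy(dest, name, len + 1);   /* includes the terminating NUL */
    return 0;
}
```

A caller declares the destination as `char buf[MAXNAMLEN + 1];` so that the array tracks the limit of the system it is compiled on.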

Directory Access The MIPS file system does not allow your program to di- 
rectly read a directory. Programs that do this are considered to be non-portable 
to RISC/os even though they may work on many versions of System V UNIX. 
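
Programs that read directories directly can usually be converted to the directory routines listed in Table 3.1. The following is a minimal sketch; the function name count_entries is hypothetical.

```c
#include <dirent.h>
#include <stddef.h>

/*
 * Returns the number of entries in the directory path, or -1 if the
 * directory cannot be opened. The directory is read with opendir()
 * and readdir() rather than by calling read() on the directory file.
 */
int count_entries(const char *path)
{
    DIR *dir = opendir(path);
    int count = 0;

    if (dir == NULL)
        return -1;
    while (readdir(dir) != NULL)
        count++;
    closedir(dir);
    return count;
}
```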

longjmp() Programs that call the function longjmp() from signal handlers 
may need special work before being optimized with -O2, since global variables 
may be placed in registers and their values may not be restored properly. 
Declaring the appropriate variables as volatile solves this problem. Alternatively, 
use the -volatile compiler option, which causes all objects to be treated 
as volatile. Keep in mind that using this option significantly reduces the amount 
of optimization that is done. 

COFF Format Programs that work with object and executable files need some 
work, as the MIPS COFF format is not completely compatible with System V. 

Dereferencing a Pointer Address 0 is an invalid address. On many BSD 
systems, this address contains a 0, and C programs may depend on being able to 
dereference pointers with this value. Dereferencing a pointer with a value of 0 
is incorrect according to all C standards. 



3.4 Dereferencing nil Pointers 



Programs that contain errors sometimes go undetected on machines where 
dereferencing a zero pointer yields zero. Typically, the programmer meant to 
write: 

int *c; 

if (c != 0 && *c) ... 

but actually wrote: 

int *c; 

if (*c) ... 

On most VAX UNIX systems, the error goes undetected; on most MC68000 im- 
plementations and on the MIPS RISComputer systems, this causes a segmenta- 
tion violation. 
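
The intended test can be wrapped in a small helper so that the pointer is always checked before it is dereferenced. This is a sketch only; the function name points_to_nonzero is hypothetical.

```c
#include <stddef.h>

/*
 * Returns 1 if c points to a nonzero int and 0 otherwise. The
 * pointer is compared against NULL first, so a nil pointer is never
 * dereferenced.
 */
int points_to_nonzero(const int *c)
{
    return c != NULL && *c != 0;
}
```

Because && evaluates its left operand first and stops when it is false, `*c` is reached only when c is not nil.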



3.5 Where Text and Data Lie in Memory 



Figure 4.1 illustrates how text and data are arranged in memory on MIPS 
chines and VAX machines. Refer to this figure as you read the paragraphs that 
follow it. 



VAX (top of address space to bottom):    2G, stack, hole, bss, data, text 

MIPS (top of address space to bottom):   2G, stack, hole, data (from 256MB), 
                                         hole, text (from 4MB), hole 

Figure 4.1. A comparison of VAX and MIPS address space 

You may encounter problems when you port a program if you assume that the 
link editor-defined symbol etext indicates the beginning of the data section as 
well as the end of the text section. As illustrated in Figure 4.1, etext would work 
on a VAX because data and text are located next to each other. However, on a 
MIPS machine they are not. To solve this problem, the MIPS link editor 
provides other symbols for the beginning and end of each text section; for more 
information see the end(3) manual page. 

Some sophisticated UNIX programs such as GNU's emacs assume that the pro- 
gram text (that is, the executable code) or data starts at a low address in memory. 
The program can, for example, store tag information in the high-order bits of a 
pointer and mask out the tag just before dereferencing the pointer. Unless you 
specify otherwise with the link editor -T and -D options, program text on 
RISComputer systems starts at approximately 0x400000, program data approxi- 
mately 0x10000000, and the program stack at approximately 0x80000000. 
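
The effect of the default layout on such a tagging scheme can be sketched with plain 32-bit arithmetic. The 8-bit tag width and all names below are assumptions made for illustration; they are not taken from emacs or from any other program.

```c
#include <stdint.h>

#define TAG_SHIFT 24u            /* hypothetical: tag in the top 8 bits */
#define ADDR_MASK 0x00FFFFFFu    /* the low 24 bits hold the address */

/* Packs a tag into the high-order bits of a 32-bit address. */
uint32_t make_tagged(uint32_t addr, uint32_t tag)
{
    return (tag << TAG_SHIFT) | (addr & ADDR_MASK);
}

/* Masks out the tag, as done just before dereferencing. */
uint32_t strip_tag(uint32_t tagged)
{
    return tagged & ADDR_MASK;
}

/* Returns 1 if addr is unchanged by a tag and untag round trip. */
int address_survives_tagging(uint32_t addr)
{
    return strip_tag(make_tagged(addr, 0x5u)) == addr;
}
```

Under this scheme a low address such as 0x00002000 survives the round trip, but the default data address 0x10000000 does not, because its significant bit lies in the field that the tag overwrites.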

3.6 Porting from Other Operating Systems 

This section discusses the issues that you need to be aware of when porting pro- 
grams from systems other than UNIX. 

3.6.1 General 

In general, if you are porting from operating systems other than UNIX, you need 
a working knowledge of both RISC/os and the operating system that you are 
porting from. 

Programs written with no operating system dependencies should port easily. 
Programs that use the standard I/O routines for the language in which they are 
written, rather than UNIX system calls, are more likely to port with little or 
no modification. C programs that follow the ANSI C Standard should work as 
expected. 
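
For example, a copy loop written with the C standard I/O routines, rather than with the open, read, and write system calls, might look like the following sketch. The function name copy_stream is hypothetical.

```c
#include <stdio.h>

/*
 * Copies everything from in to out using only standard I/O routines.
 * Returns the number of characters copied, or -1 on a read or write
 * error.
 */
long copy_stream(FILE *in, FILE *out)
{
    long copied = 0;
    int ch;

    while ((ch = getc(in)) != EOF) {
        if (putc(ch, out) == EOF)
            return -1;
        copied++;
    }
    return ferror(in) ? -1 : copied;
}
```

The function takes already-opened streams, so the only operating system dependency left to the caller is the form of the file names passed to fopen().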

3.6.2 Porting FORTRAN Programs from VAX 

Library functions that provide an interface to RISC/os 4.0 (similar to those 
provided by the C library) are available to MIPS-FORTRAN programs. Also, 
intrinsic subroutines and functions used to interface with VAX systems are 
available to provide the same functional interface to RISC/os from 
MIPS-FORTRAN programs. Chapter 4, Part I, of the MIPS-FORTRAN Programmer's 
Guide and Language Reference describes these system functions and subroutines. 
4 
Hardware-Related Considerations 



This chapter discusses specific MIPS RISComputer implementation details that 
you need to consider when porting programs. 

4.1 Floating Point Arithmetic 

In 1985, ANSI/IEEE 754-1985 defined a standard floating point representation 
and arithmetic. MIPS RISComputers conform to this standard. While you might 
expect conformance to this standard to eliminate problems with porting floating 
point programs, there are still significant differences that can hinder an imple- 
mentation's portability. 

Floating point differences manifest themselves by: 

• Producing slightly different results 

• Producing incorrect results 

• Slow execution 

• Faulting 

4.1.1 General IEEE 754 

Table 4.1 lists the IEEE floating point format. The explicit use of extended 
precision formats available on some IEEE 754 floating point implementations makes 
programs non-portable, because there is no simple or efficient way to get the 
range or accuracy of IEEE extended on a machine whose highest precision is 
double. To avoid this problem, try using double precision instead; however, us- 
ing double may give you incorrect results. Therefore, before you substitute dou- 
ble for extended, analyze why your program is using extended. 

Some compilers use extended precision even when your program does not spec- 
ify it. This is because some hardware such as Motorola 68881, 68882, and Intel 
80387, makes this the only efficient thing to do. When an expression is evalu- 
ated using extended precision, you may get a slightly different answer than if it 
were evaluated in double precision. 

The implementation of library math functions differs from machine to machine, 
so you will see slightly different results when you run programs on the MIPS 
RISComputer. 
Table 4.1 IEEE Floating Point Format 

Format          Size      Radix            Approximate Range     Rounding            Exceptions 

single format   32 bit    2 with 24 bits   10^-45 to 10^38       round to nearest,   for overflows, divide by zero, 
                          of precision                           round .5 to even    and so on, do not fault, but 
                                                                                     instead return special symbols 

double format   64 bit    2 with 53 bits   10^-324 to 10^308     same as above       same as above 
                          of precision 

extended        80+ bit   2 with 64 bits   10^-4951 to 10^4931   same as above       same as above 
                          of precision 



4.1.2 DEC VAX 



Table 4.2 lists the DEC VAX floating point formats. If you are porting a program 
that uses VAX floating point to a MIPS RISComputer, which uses IEEE floating 
point, keep in mind that the IEEE formats are much more likely to expose 
confusion between single and double precision in loosely typed languages 
like FORTRAN. 

For example, the VAX single- and double-precision formats are identical 
except that the double-precision format provides additional precision. Therefore, 
if you reference as single something that is really double, it has little effect on 
the value. Because the IEEE 32-bit and 64-bit formats are different, the same 
mistake on a RISComputer could produce data that does not even resemble 
the data produced on the original machine. 
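
The effect of this mistake can be reproduced in C. The following is a minimal sketch, assuming an IEEE machine where float is 32 bits and double is 64 bits; the function name first_word_as_float is illustrative only and is not part of any MIPS library. It copies the first four bytes of a double into a float, which is what a single-precision reference to a double-precision variable amounts to.

```c
#include <string.h>

/* Reinterpret the first sizeof(float) bytes of a double as a float.
   This mimics referencing a double-precision variable as single
   precision (assumption: 32-bit float, 64-bit double, IEEE formats). */
float first_word_as_float(double d)
{
    float f;

    memcpy(&f, &d, sizeof f);
    return f;
}
```

On an IEEE machine, first_word_as_float(1.5) does not return 1.5; the exact value it does return depends on the byte order of the host.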

Use of H-format (REAL*16) is non-portable to RISComputers, because their 
floating point does not have the range or accuracy of H-format. Using IEEE 
double precision will likely give you incorrect answers. 

The default double precision, D-format, has more precision but less exponent 
range than IEEE double, thus precision-sensitive programs may give different 
results. 

Table 4.2 VMS Floating Point Format 

Format     Size      Radix             Approximate Range     Rounding           Exceptions

F-format   32 bits   2 with 24 bits    10^-38 to 10^38       round to nearest,  for overflows, divide by
                     of precision                            round .5 and up    zero

D-format*  64 bits   2 with 56 bits    10^-38 to 10^38       same as above      same as above
                     of precision

G-format*  64 bits   2 with 53 bits    10^-308 to 10^307     same as above      same as above
                     of precision

H-format   128 bits  2 with 112 bits   10^-4933 to 10^4931   same as above      same as above
                     of precision



* F-format is similar to IEEE single format and G-format is similar to IEEE double format. 



4.1.3 IBM 370 



Table 4.3 lists the IBM 370 floating point formats. Programs that depend on the 
larger single-precision exponent range are non-portable. MIPS RISComputers 
generally provide better accuracy, and therefore different results. 

Table 4.3 IBM 370 Floating Point Format 

Format          Size       Radix                     Approximate Range   Rounding

single format   32 bits    16 with 6 radix-16        10^-78 to 10^75     chopped
                           digits precision

double format   64 bits    16 with 14 radix-16       10^-78 to 10^75     same as above
                           digits precision

Real*16         128 bits   16 with 30 radix-16       10^-78 to 10^75     same as above
                           digits precision,
                           implemented in software



4.1.4 Cray 



Table 4.4 lists the Cray floating point formats. Because Cray's single-precision 
is a 64-bit format, it is generally necessary to switch to MIPS double precision to 
get the same results. Also, if your program depends on a large exponent range or 
128-bit precision, program modifications are required. MIPS RISComputers 
provide better accuracy than Cray's 64-bit format, and therefore different results 
occur. 



Table 4.4 Cray Floating Point Format 

Format          Size       Radix                     Approximate Range     Rounding

single format   64 bits    48 bits of precision      10^-2460 to 10^2460   chopped or worse

double format   128 bits   95 bits of precision,     10^-2460 to 10^2460   chopped or worse
                           implemented in software



4.1.5 Math Library Accuracy 



Besides basic floating point format and accuracy issues, each implementation 
typically differs in the algorithms and characteristics of its math library. Even 
IEEE 754-1985 machines that are otherwise identical may produce different 
results due to differences in math libraries. See math(3M) for additional 
information on the MIPS math library. The algorithms are generally from Cody and 
Waite, with some additions and replacements from 4.3 BSD. 



4.2 Endianness 



Machines that number the bytes within a 32-bit integer from right to left, so 
that the least significant byte is byte 0, are called little-endian; machines 
that number the bytes from left to right, so that the least significant byte is 
byte 3, are called big-endian. See Appendix D and Byte Ordering options in 
Chapter 1 of the Language Programmer's Guide for more information on these 
byte ordering schemes. Although the MIPS R2000 chip can operate either way, 
MIPS RISComputer systems use the big-endian byte ordering scheme. 

You may create porting problems by placing small objects side by side to make a 
bigger object, or by splitting a big object into small objects. For example, the 
following code that reads and compares a pair of shorts is machine-dependent, 
because on some machines the 0th element of the array represents the high-order 
half of the word rather than the low-order half: 

char carray[BUFSIZ]; 

err = read(0, carray, 4); 

if ((carray[0] | (carray[1] << 8)) > 
    (carray[2] | (carray[3] << 8))) ... 

There is never a problem if you use the correct data type and let the compiler deal 
with the order of the bytes: 

short sarray[BUFSIZ]; 

err = read(0, (char *) sarray, 2 * sizeof(short)); 

if (sarray[0] > sarray[1]) ... 
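
When the byte order of the data is fixed by a file format rather than by the host, one hedged alternative is to assemble each value from unsigned bytes in the order the format defines. The sketch below assumes the data was written high-order byte first; the helper names load_be16 and host_is_big_endian are illustrative and do not come from the RISC/os libraries.

```c
/* Combine two bytes that were written high-order byte first
   (big-endian) into a 16-bit value, regardless of host byte order. */
unsigned int load_be16(const unsigned char *bytes)
{
    return ((unsigned int) bytes[0] << 8) | (unsigned int) bytes[1];
}

/* Report the host byte order by inspecting the first byte in memory
   of an integer whose value is 1. */
int host_is_big_endian(void)
{
    unsigned int probe = 1;

    return *(unsigned char *) &probe == 0;
}
```

Because load_be16 uses unsigned char and explicit shifts, it gives the same answer on big-endian and little-endian machines.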

Similarly, the following code to print four characters stored within an integer is 
machine-dependent because it assumes the first character is at the low-order end 
of the integer: 

unsigned i; 

printf("%c%c%c%c\n", i & 0xff, (i >> 8) & 0xff, 
       (i >> 16) & 0xff, (i >> 24) & 0xff); 

A better solution is: 

unsigned i; 

printf("%.4s\n", (char *) &i); 



4.3 Alignment 



RISComputer architecture requires that each piece of data in memory be aligned 
on a boundary appropriate to its size. For example, an n-byte integer must be 
aligned on a boundary whose address is a multiple of n bytes, up to a maximum 
of 8 bytes. This restriction permits the memory system to run much faster. 

Ordinarily alignment has no effect on correctly written programs, because the 
compiler inserts unused space ("padding") between variables wherever neces- 
sary to conform to the rules. Language standards almost always permit such pad- 
ding, and in the rare cases where the language forbids it, the compiler conforms 
to the language requirements by loading and storing objects in special ways (see 
Section 8.2 in this manual for information on how this applies to FORTRAN 
programs). 

A program that follows the rules of its language usually doesn't encounter 
problems. To avoid alignment problems, declare the fields of a structure 
in descending order by size. 
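
The effect of field order on padding can be checked with sizeof and offsetof. The following is a minimal sketch; the structure names mixed and ordered and the function padding_saved are hypothetical, and the actual sizes depend on the compiler and machine.

```c
#include <stddef.h>

/* Fields in arbitrary order: the compiler may insert padding after c
   (to align d) and after s (to round up the structure size). */
struct mixed {
    char   c;
    double d;
    short  s;
};

/* The same fields in descending order by size. */
struct ordered {
    double d;
    short  s;
    char   c;
};

/* Number of bytes saved by reordering (0 if the compiler pads neither). */
size_t padding_saved(void)
{
    return sizeof(struct mixed) - sizeof(struct ordered);
}
```

Printing offsetof(struct mixed, d) alongside the two sizes shows where the compiler placed the padding on a given machine.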

Even a program that follows the rules given above may have trouble when writ- 
ing data on one machine and reading it on another. In fact, padding is only one 
of many problems: machines differ with regard to endianness, floating-point for- 
mats, the size of the integer, and the width of a character or word. There are two 
collective solutions to these problems: if I/O speed is not important, use ASCII 
files rather than binary ones; otherwise, consider using the xdr(3N) subroutine 
package for external data representation. 
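
As an illustration of the ASCII approach, the sketch below writes a record as text and reads it back. It assumes a C library that provides snprintf, which is newer than the libraries described in this manual; the names format_record and parse_record are hypothetical.

```c
#include <stdio.h>

/* Write one record as ASCII text: an integer id and a double value.
   "%.17g" prints enough digits to recover an IEEE double exactly. */
int format_record(char *buf, size_t size, long id, double value)
{
    return snprintf(buf, size, "%ld %.17g\n", id, value);
}

/* Read the record back; returns 1 on success, 0 on a malformed line. */
int parse_record(const char *buf, long *id, double *value)
{
    return sscanf(buf, "%ld %lf", id, value) == 2;
}
```

A text record like this is independent of the writer's byte order, padding, and integer width, at the cost of conversion time on both ends.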

See Section 8.2 in this manual for a discussion of the extensions to the compiler 
system for dealing with misaligned data as they apply to FORTRAN programs. 

You may choose one of the following three command-line arguments to deal 
with various degrees of misalignment: 



-align8     Permits objects larger than 8 bits to be aligned on 8-bit 
            boundaries. This option requires the greatest amount of 
            space; however, it is the most complete solution; 16-bit 
            padding is not inserted for integer*2 objects within common 
            blocks. 

-align16    Permits objects larger than 16 bits to be aligned on 16-bit 
            boundaries; 16-bit objects must still be aligned on 16-bit 
            boundaries (MC68000-like alignment rules); 16-bit padding 
            is not inserted for integer*2 objects within common blocks. 

-align32    Permits objects larger than 32 bits to be aligned on 32-bit 
            boundaries; 16-bit objects must still be aligned on 16-bit 
            boundaries, and 32-bit objects must still be aligned on 
            32-bit boundaries. This option requires the least amount of 
            space, but isn't a complete solution; 16-bit padding is 
            inserted for integer*2 objects within common blocks. 



4.4 Uninitialized Variables 



Whenever possible, initialize local variables. The lint(1) C program checker and 
the RISCompilers issue warning messages about uninitialized data in certain 
instances. However, because the system can see only the static characteristics of a 
program, it cannot warn about all instances of uninitialized data. 

If your program's failures vary with the input data, but the variances are not logi- 
cally related to the failing code, look for uninitialized variables. 

In addition, if your program works when compiled with the default -O1 
optimization, but fails when compiled with -O2 optimization, then the fault may be 
caused by uninitialized variables rather than the optimizer. In an -O2 
optimization, the optimizer may allocate an uninitialized variable to a register, 
creating an error that would not have occurred in an -O1 optimization. 

On the MIPS RISComputer, uninitialized variables can degrade the performance 
of a program that otherwise runs correctly. The hardware performs most IEEE 
operations, but software is invoked for operations on denormalized numbers. If 
an uninitialized floating-point variable happens to hold a denormalized IEEE 
value, the algorithm in your program could continue to function properly even 
though the variable is non-zero, provided it remains close to zero. This 
situation could degrade the performance and accuracy of your program. 
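
One way to confirm that a suspect value is denormalized is to classify it explicitly. The sketch below assumes a compiler that supplies the C99 fpclassify macro and FP_SUBNORMAL constant in math.h, which are newer than the RISCompilers described here; the function name is_denormalized is hypothetical.

```c
#include <float.h>
#include <math.h>

/* Returns 1 if x is a denormalized (subnormal) IEEE value, else 0.
   Assumption: fpclassify and FP_SUBNORMAL from C99 <math.h>. */
int is_denormalized(double x)
{
    return fpclassify(x) == FP_SUBNORMAL;
}
```

Calling is_denormalized on a variable just before a slow loop can show whether denormalized operands are reaching the computation.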

If you suspect this problem, use the time(1) command. For most programs, the 
system CPU time is small compared to the user CPU time. If the system time is 
unexpectedly high but not high enough to account for the overall slowdown, 
that's a good indication of denormalized arithmetic. The system time does not 
account for the entire slowdown, because not all of the emulation time is charged 
against your program. 

Another aid in diagnosing these problems is the fpi(3) floating point interrupt 
analyzer. The fpi routines count the instances of floating-point emulation and 
print a summary. 



Undefined Language Elements 



Language standards deliberately avoid defining certain language constructs, thus 
causing inconsistencies among different implementations of the same language. 
This section explains how the RISCompiler system defines some of these con- 
structs, which you may need to alter in the program being ported. 



5.1 The Value of nil 



C, Pascal, and PL/I do not specify the value that the compiler must use to repre- 
sent a nil (or "null") pointer. However, C does dictate that the compiler must 
recognize a zero in the source program as the notation for a nil pointer and con- 
vert it into whatever value does represent nil. 

The MIPS RISCompiler system uses zero to represent "nil". Few UNIX 
programs encounter any difficulty with this, but other operating systems use other 
values like "-1" or "-maxint - 1". A portable program shouldn't depend on 
this value, but for convenience the MIPS PL/I compiler does recognize an option 
that permits you to change the value: 



pl1 -Wf,-setnull,-1 

5.2 Order of Evaluation 



The order in which the parts of an expression or argument list are evaluated 
can cause problems, as shown below. 

For example, the expression in the following Pascal statement can cause trouble 
if the programmer hoped that it would invoke the decrement function on both 
variable x and variable y: 

if (decrement (x) < 0) and (decrement (y) < 0) then 



As another example of side effects, neither Pascal nor C specifies the order in 
which the compiler evaluates an actual argument list: 

foo(decrement(x), x + y); 

Another example is the C statement: 

foo(*p++, *p++); 

The best way to control the order of evaluation in a program being ported to a 
RISComputer system is to introduce temporary variables. Because of global 
optimization, this usually costs nothing (apart from forcing the intended order and 
degree of evaluation), because the compiler attempts to allocate all of the objects 
to registers: 

temp1 := decrement(x); 
temp2 := decrement(y); 
if (temp1 < 0) and (temp2 < 0) then ... 

temp1 := decrement(x); 
foo(temp1, x + y); 
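
The same technique applies to the C statement shown earlier. The following is a minimal sketch, assuming a hypothetical two-argument function named diff that stands in for foo; naming each operand fixes the order in which the two values are fetched.

```c
/* Hypothetical callee, used only to make the argument order visible. */
static int diff(int first, int second)
{
    return first - second;
}

/* Portable replacement for diff(*p++, *p++): each fetch is a separate
   statement, so the language fixes the order of the two reads. */
int diff_next_two(const int *p)
{
    int first = *p++;
    int second = *p++;

    return diff(first, second);
}
```

With the temporaries in place, the result no longer depends on whether the compiler evaluates arguments left to right or right to left.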



5.3 Inter-Language Interfaces 



The allocation of variables in memory, the rules of argument-passing, and the 
mapping of source-language identifiers onto assembly-level symbols all pose 
problems that appear when you stop programming entirely within one language 
and start calling routines written in another language. 

For example, Pascal specifies that the ord function must return zero for a false 
boolean and one for a true boolean; but Pascal does not specify whether a 
boolean value is stored in memory as a single bit, a byte, or a full word. In fact, 
Pascal permits a compiler to implement true by setting the sign bit of a word, or 
even by setting all bits to 1, provided the ord function performs the appropriate 
conversion. As long as you program entirely in Pascal, you need never know 
these details, but when Pascal code passes a boolean to a C subroutine, the latter 
must know whether to expect a char, a short, or an int, and what value consti- 
tutes true. 

For information on interfaces between C and Pascal programs, see Chapter 4 in 
the Language Programmer's Manual. For information on the interfaces be- 
tween FORTRAN and C programs, and FORTRAN and Pascal programs, see the 
MIPS-FORTRAN Programmer's Guide. 



MIPS C conforms to the de facto standard established by the Kernighan and 
Ritchie text and the AT&T portable C compiler. It provides certain extensions, 
such as prototype declarations, suggested by the draft ANSI C standard. See 
Appendix A in the Language Programmer's Guide for more information on C 
extensions. 



6.1 Using the C Preprocessor 

To maintain your program on both an old system and on the RISComputer 
system, consider using the #ifdef conditional-compilation facility provided by the 
cpp preprocessor. The C and Pascal RISCompilers provide this feature by 
default; the FORTRAN compiler provides it if you either use the -cpp driver 
option, or give your source file a name ending in .F rather than .f. Using cpp, you 
can include the following conditional statements in your program: 

#ifdef MY_OLD_MACHINE 
    x := #ff5a; 
#endif /* MY_OLD_MACHINE */ 

#ifdef MIPS 
    x := 16#ff5a; 
#endif /* MIPS */ 

Then, you can use the -D option to select the appropriate version. For example, 
to generate a MIPS-specific version of a Pascal program, you would specify: 

pc -DMIPS myprog.p -o myprog 

To translate myprog.p into a source file myprog.i suitable for compilation on your 
old machine, you would use the -P option as follows: 

pc -P -DMY_OLD_MACHINE myprog.p 

rcp myprog.i my_old_machine:myprog.p 

On most machines, including RISComputers, the -D option is unnecessary if you 
use a name that is automatically defined for you. MIPS compiler drivers 
predefine the following automatically: 

mips 

host_mips 

MIPSEB 

MIPSEL 

LANGUAGE_C 

LANGUAGE_PASCAL 

LANGUAGE_FORTRAN 

LANGUAGE_ASSEMBLY 

LANGUAGE_PL1 

LANGUAGE_COBOL 

unix 

SYSTYPE_BSD 

SYSTYPE_SYSV 



Note: Typically, you use #ifdef mips for differences that are hardware related or 
os related and #ifdef MIPS for differences due to other programs or preferences. 
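
The predefined names can be tested without any -D option. The following is a minimal sketch that uses the MIPSEB and MIPSEL names from the list above; the function name byte_order_name and the returned strings are hypothetical, and the final branch is what any compiler that defines neither name selects.

```c
/* Select code at compile time from the driver's predefined names.
   MIPSEB and MIPSEL come from the list above; on a compiler that
   defines neither, the fallback branch is compiled instead. */
const char *byte_order_name(void)
{
#if defined(MIPSEB)
    return "big-endian MIPS";
#elif defined(MIPSEL)
    return "little-endian MIPS";
#else
    return "not a MIPS build";
#endif
}
```

Keeping a fallback branch means the same source still compiles on the old machine, which is the point of maintaining one program for both systems.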



6.2 Using the Lint Program Checker 



The lint program checker tries to find areas in the source code of C programs that 
are nonportable or that are likely to cause errors. See the lint(1) manual page in 
the User's Reference Manual for reference information. Here are some 
guidelines to follow when using lint: 

• Instead of running lint on your source files one by one, run it a sin- 
gle time, specifying the names of all the source files. The lint com- 
mand detects such problems as argument-list mismatches more 
thoroughly when it processes the entire source program at once. 

• Use the same -D and -I options (if any) that you specify when you 
compile. 

• Analyze lint error or warning messages carefully before changing 
your code; make sure you understand why lint is creating the errors. 
For example, suppose lint indicates that a function result is 
incompatible with its use: 

double d; 

d = atof("1.23"); 

You could satisfy lint by putting a cast in front of the function call: 

d = (double) atof("1.23"); 

but in fact you would be masking the problem rather than fixing it. 
The correct solutions are to either include math.h in your program 
or declare atof so that the compiler knows that it returns a double 
value rather than an int, as follows: 

double d; 

extern double atof(); 

d = atof("1.23"); 



6.3 Memory Allocation 

The interface to the C library memory allocator malloc is standard, but the 
implementation varies. RISC/os 4.0 uses the 4.3 BSD malloc, rather than the System 
V.3 malloc, because the former is significantly faster. 

BSD malloc can mask allocation errors: because it rounds the requested block 
size up to a power of two, programs that write more than they allocate may 
still work. While power-of-two allocation is fast, it is inappropriate for large 
data block sizes. 

Note that UNIX memory allocators use more memory than the programs request. 
If you plan to allocate memory in large chunks and never free them during 
execution, consider using sbrk(2). 

If you suspect a problem caused by memory allocation, try a different allocator 
and see if the problem disappears or changes. UMIPS provides the following 
memory allocators in addition to the standard malloc version: 

• an optional malloc, which you can obtain by specifying the -lmalloc 
option during compile/link edit 

• an additional allocator with routines xmalloc, xfree, and xrealloc, 
which resides in /usr/lib/libp.a (in release 1.31 or earlier, specify 
-lxmalloc). You can link in these routines using the -lp option during 
compile/link edit. This allocator's interface is identical with that of 
malloc, free, and realloc. 

Even if using a different memory allocator solves the problem, you should still 
fix it to prevent a recurrence. Here are some approaches you can take: 

1. Replace all calls to malloc and realloc with a wrapper routine that 
initializes the newly-allocated block (or the yet-unused portion of 
the reallocated block) to zero. If the problem disappears, look for 
code that erroneously assumes that newly allocated memory is in- 
itialized to zero. 

2. Replace all calls to malloc and realloc with a wrapper that calls 
those routines, allocating one more byte than you ask for. If the 
problem disappears, this experiment may have hidden the problem by 
altering the order of blocks in memory. It is also likely that (in Pascal or 
Fortran) the program is confused about whether a character array 
originates at 0 or 1, or that (in C) the program did not leave space 
for the "null" byte that terminates a string. 

3. Replace all calls to malloc and realloc with a wrapper that calls 
those routines, allocating four or eight more bytes than you ask for. 
If the problem disappears, then a zero-origin problem with an 
integer, real, or double-precision array exists. 

4. Experimentally replace all calls to free with an empty routine. If the 
problem disappears, the experiment may have masked the true 
problem by rearranging blocks in memory. However, dangling pointers 
to reused space may be causing the problem. Make sure that the 
program does not retain pointers to any data structure whose address 
may change due to a call on realloc. 
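
The wrappers used in approaches 1 through 3 can be very small. The following is a minimal sketch for malloc only; the names zero_malloc and padded_malloc are hypothetical, and a realloc wrapper would additionally need to know the old block size in order to clear only the new portion.

```c
#include <stdlib.h>
#include <string.h>

/* Approach 1: allocate a block and clear it, to expose code that
   assumes newly allocated memory is zero. */
void *zero_malloc(size_t size)
{
    void *block = malloc(size);

    if (block != NULL)
        memset(block, 0, size);
    return block;
}

/* Approaches 2 and 3: allocate "extra" bytes beyond the request
   (1, 4, or 8) to expose off-by-one and zero-origin overruns. */
void *padded_malloc(size_t size, size_t extra)
{
    return malloc(size + extra);
}
```

Redirecting calls with a macro such as #define malloc(n) zero_malloc(n) in a common header is one way to try the experiment without editing every call site.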



6.4 Signed chars 



Like AT&T 3B compilers, but unlike most VAX and MC68000 compilers, the 
MIPS RISCompiler System interprets char to mean unsigned char. The 
-signed command-line option, however, reverses this. 

To understand the consequences of unsigned characters, consider that the 
character 0xff is not the same as -1; and a loop like 

char c; 

for (c = '\ '; c >= 0; c--) ...; 

never terminates because the variable c can never be negative. 

The MIPS C compiler, and others that have adopted features of the proposed 
ANSI draft standard, permit you to specify either signed char or unsigned char 
explicitly in a declaration. Alternatively, you can use masks or shifting to 
eliminate or propagate bits. 

Lint detects such problems by printing the diagnostic message degenerate 
unsigned comparison. 



6.5 Bitfields 



For a bit field declaration within a structure, the MIPS C compiler uses signed or 
unsigned bitfields depending on your declaration. The Kernighan and Ritchie 
definition of the language permits a compiler to ignore these attributes and 
always use signed arithmetic or always use unsigned arithmetic; some compilers 
take advantage of this. 
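
Both the char issue of Section 6.4 and the bit field issue above can be sidestepped by stating the signedness explicitly. The following is a minimal sketch, assuming a compiler that accepts the signed keyword; the names count_down and flags are hypothetical.

```c
/* Counts the iterations from start down to 0 inclusive (start must
   be between 0 and 127). Declaring the counter as "signed char"
   keeps the loop finite even where plain char is unsigned. */
int count_down(int start)
{
    signed char c;
    int steps = 0;

    for (c = (signed char) start; c >= 0; c--)
        steps++;
    return steps;
}

/* Bit fields with explicit signedness. */
struct flags {
    unsigned int mode  : 3;   /* holds 0 through 7 */
    signed int   delta : 3;   /* holds -4 through 3 */
};
```

With plain char in place of signed char, the loop in count_down would never terminate on a compiler that treats char as unsigned.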



6.6 Short, Int, and Long Variables 



On a RISComputer system, a short variable is 16 bits wide; an int variable is 32 
bits wide; and a long variable is also 32 bits wide. Some microcomputer 
compilers allocate only 16 bits for int and 8 bits for short, and some programs may rely 
on this. In general, manipulating 32-bit objects with the RISComputer 
architecture is as fast as or faster than manipulating 16-bit objects. 

6.7 Leading "_" 

Like AT&T 3B compilers, and unlike the BSD UNIX VAX compiler, the MIPS 
C compiler system doesn't prepend an underscore to the name of a C-compiled 
symbol. 



6.8 Varargs 



To improve performance, RISCompilers pass certain procedure arguments in 
registers. This process is normally transparent to you, except for functions that 
use variable-length argument lists. Such functions must use the macros provided in 
/usr/include/varargs.h or /usr/include/stdarg.h. They must not assume 
that the arguments all appear in memory and can be accessed by taking the 
address of the first argument and incrementing it. Both the ANSI draft standard 
and the Kernighan and Ritchie definition of the language warn that programs 
attempting to implement variable argument lists without using varargs may not be 
portable. 

Even varargs cannot deal with a situation where the argument list varies in type 
as well as length. Consider the following rather common practice of assuming 
that all C data types are equivalent for purposes of parameter-passing: 

error(s, a, b, c, d, e) 
char *s; 
int a, b, c, d, e; 
{ 
    fprintf(stderr, s, a, b, c, d, e); 
} 

double d; 

error("Value %g should be between %g and %g\n", 
      d, 1.2, 6.5); 

The problem with this routine isn't that the argument list is variable in length, but 
rather that the routine declares arguments a through e as integers when in fact the 
caller supplies floating point numbers. This violates both the Kernighan 
and Ritchie definition of the language and the ANSI draft standard. In addition, 
it has dire consequences, because RISComputer architecture uses two separate 
sets of registers to pass integer and floating point arguments, and because it 
imposes rules on the alignment of data types. The fprintf routine can accept variably 
typed arguments because it determines the types at execution time and references 
them appropriately; but the routine in the above example tells the compiler to emit a 
single version of "error" that always references them all as type int. 

The following code fragment is the best method to use for a program being 
ported to MIPS C. Any other system that provides the vfprintf routine can 
also use this routine: 

#include <stdio.h> 
#include <varargs.h> 

/* VARARGS1 */ 
void 
error(s, va_alist) 
char *s; 
va_dcl 
{ 
    va_list ap; 

    va_start(ap); 
    vfprintf(stderr, s, ap); 
    va_end(ap); 
    fputs("\n", stderr); 
    exit(1); 
} 

error("Value %g should be between %g and %g\n", 
      d, 1.2, 6.5); 

However, a solution using a macro would make this routine more portable: 

#define error(_s, _a, _b, _c, _d, _e) \ 
    fprintf(stderr, _s, _a, _b, _c, _d, _e); \ 
    exit(1); 

error("Value %g should be between %g and %g\n", 
      d, 1.2, 6.5, 0, 0); 
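
For compilers that follow the ANSI standard, the stdarg.h form of the same idea looks like this. It is a sketch only: the function name format_message is hypothetical, it formats into a caller-supplied buffer rather than exiting so that it can be exercised safely, and vsnprintf is an assumption about the C library (it is newer than the libraries described here).

```c
#include <stdarg.h>
#include <stdio.h>

/* Format a message with a variable argument list, ANSI style.
   The arguments are passed through untouched, so their types are
   interpreted by the formatting routine, not by this function. */
int format_message(char *buf, size_t size, const char *fmt, ...)
{
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = vsnprintf(buf, size, fmt, ap);
    va_end(ap);
    return n;
}
```

Because the wrapper never names the variable arguments, it makes no assumption about whether they travel in integer or floating point registers.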



6.9 typedef Names 



ANSI C provides prototypes that in one instance conflict with Kernighan and 
Ritchie usage. ANSI C makes it illegal for a typedef name to appear in the 
argument list for a function definition. For example, in the following code: 

typedef int P; 
function(P) 
{ 
} 

the occurrence of P in the argument list is illegal since the compiler expects an 
identifier after the type "P". MIPS C conforms to the ANSI standard in this case. 



6.10 Functions Returning Float 



Functions that are declared as returning float actually return float rather than 
double as in some older implementations of C. If the result is then used in a context 
requiring promotion to double, it is promoted after returning from the function 
call. 



6.11 Casting 



Casting is not permitted on the left hand side of an assignment. If you are 
porting a program that currently runs on a Sun Workstation, you may have problems 
with this because Sun allows casting on the left hand side. 
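
A cast on the left hand side can usually be rewritten by moving the casts to the right hand side. The following is a minimal sketch of one common case, advancing a typed pointer by a byte count; the function name advance_bytes is hypothetical, and the caller is assumed to keep the result suitably aligned.

```c
/* Advance an int pointer by nbytes.
   Nonportable form:   (char *) *ipp += nbytes;
   Portable form: cast on the right hand side, then convert back. */
void advance_bytes(int **ipp, int nbytes)
{
    *ipp = (int *) ((char *) *ipp + nbytes);
}
```

The rewritten assignment has an ordinary variable on the left, so it is accepted by compilers that reject a cast there.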



6.12 Dollar Sign in Identifier Names 



The dollar sign ('$') is not a legal character in an identifier name. Because VAX 
and Sun compilers allow you to use '$' as a legal character, MIPS provides the 
command line argument: 

-Wf,-Xdollar 

6.13 Additional Keywords 

MIPS will eventually conform to the ANSI C standard; therefore, the compiler 
treats const, signed, and volatile as keywords. 

6.14 alloca() 

The current RISC/os operating system does not support alloca(). 

6.15 Unsigned Pointers 

The MIPS RISCompiler system treats pointers as unsigned rather than signed 
integers. For example, the following code: 

extern char *sbrk(); 

char *p; 

p = sbrk(4090); 

if (p < 0) error("out of memory"); 

does not work as expected because MIPS RISCompilers use unsigned pointer 
comparisons, and nothing unsigned is less than zero. The test never detects a 
failure, even though sbrk returns -1 if it fails. The proper way to test for failure is: 

if (p == (char *) -1) error("out of memory"); 



MIPS Pascal conforms to the IEEE standard, which is similar to the original 
Wirth-Jensen report, rather than to the ISO standard. It also provides a number 
of extensions, but not UCSD string support or ISO conformant arrays. See 
Appendix B in the Language Programmer's Guide for more information on Pascal 
extensions. 

If you wish to maintain your program on both an old system and on the 
RISComputer system, refer to Section 6.1 in this manual. 



7.1 Runtime Checking 



When possible, compile your program with runtime checking using the -C 
option, which generates code that checks that subscripts don't exceed the range 
specified for them in the program. Storing one byte past the end of an array of 
characters may be harmless on one system, if the compiler decides not to use the 
byte for anything, but may cause an execution error on another system, if a 
compiler decides to store something such as a subroutine return address there. 



7.2 Pascal Dynamic Memory Allocation 



The MIPS Pascal compiler provides the much-requested dynamic allocation 
extensions to IEEE Pascal. The compiler provides a new generic data type, 
pointer, which is type-compatible with any standard Pascal pointer type. 

The new capability does not allow you to directly take the address of an arbitrary 
variable or directly dereference a generic pointer. However, you can take the 
address of any object in the Pascal heap, or you can use the C library function 
malloc to return a generic pointer. Once you have a generic pointer containing 
the desired address, you can use any Pascal pointer type as a "template" to 
dereference that pointer. 

Here is an example of one approach that uses malloc: 

(* Declare interface to C library function for dynamic 
   allocation *) 

function malloc(number_of_bytes: integer): pointer; extern; 

(* Declare interface to C library function for rapidly 
   setting a block of memory to a fixed value *) 

procedure memset(destination: pointer; value: char; 
                 number_of_bytes: integer); extern; 

(* Two examples: a string, and an array of real numbers *) 

type 
    big_char_array = packed array [0 .. maxint] of char; 
    string = record 
        length: integer; 
        data: ^big_char_array; 
    end; 
    big_real_array = packed array [0 .. maxint] of real; 
    matrix2d = record 
        rows, columns: integer; 
        data: ^big_real_array; 
    end; 

var 
    s: string; 
    m: matrix2d; 
    i, j: integer; 
begin 

    (* To read a string of length "i" from the input: *) 

    s.length := i; 
    s.data := malloc(i * sizeof(char)); 
    if s.data = nil then 
        ...handle allocation error here... 
    for j := 0 to i - 1 do 
    begin 
        s.data^[j] := input^; 
        get(input); 
    end; 

    m.rows := 5; 
    m.columns := 7; 
    m.data := malloc(m.rows * m.columns * sizeof(real)); 
    if m.data = nil then 
        ...handle allocation error here... 

    (* Clear the array *) 

    memset(m.data, chr(0), m.rows * m.columns * sizeof(real)); 

    for i := 0 to m.rows - 1 do 
        for j := 0 to m.columns - 1 do 
            m.data^[i * m.columns + j] := 1.0; 

For reasons explained in Chapter 10 of this manual, you should refrain from 
using the generic-pointer facility with variables which lie in local or global 
memory rather than in the heap or the malloc area. For example, while the following 
trick does permit you to take the address of any character array, it is unsafe when 
used with ordinary local or global variables. 

In one module: 

function char_addr(p: pointer): pointer; 
begin 
    char_addr := p; 
end; 

In other modules: 

function char_addr(var c: char): pointer; extern; 
procedure mung_strings(p, q: pointer); extern; 
var 
    x: packed array [1 .. 10] of char; 
    y: packed array [1 .. 100] of char; 
    p, q: pointer; 
begin 
    p := char_addr(x[1]); 
    q := char_addr(y[1]); 
    mung_strings(p, q); 
end 



This chapter describes MIPS-FORTRAN, which contains full American 
National Standard (ANSI) Programming Language FORTRAN (X3.9-1978) plus 
MIPS extensions that provide full VMS FORTRAN compatibility to the extent 
possible without the VMS operating system or VAX data representation. 
MIPS-FORTRAN also contains extensions that provide partial compatibility with 
programs written in SVS FORTRAN and FORTRAN 66. 

MIPS-FORTRAN is a superset of VMS FORTRAN; the MIPS compiler system 
can convert source programs written in VMS FORTRAN into machine programs 
executable under the UMIPS operating system. 

See the MIPS-FORTRAN Language Reference and the MIPS-FORTRAN User's 
Guide for more information on language extensions. 

If you wish to maintain your program on both an old system and on the 
RISComputer system, read Section 6.1 in this manual. 

8.1 Static Versus Automatic Allocation 

For fastest program execution, the FORTRAN compiler uses -automatic
allocation by default. If your program requires static allocation, you can
use the -static driver option when you compile; however, program execution
speed is sacrificed in most cases. A better solution is to use the ANSI
FORTRAN 77 SAVE statement to specify the particular variables that must be
statically allocated to make the program work correctly.

One symptom of a program that requires -static is repeated program failures
caused by the use of uninitialized local variables.

Neither ANSI FORTRAN 66 nor FORTRAN 77 permits a program to assume 
that local variables are automatically initialized to zero, or that local variables 
retain their values from the time a subroutine returns until the next time that sub- 
routine is invoked. 

Many older compilers use static allocation; that is, they allocate a location
in global memory for each local variable in each subroutine. Because each
local variable has its own fixed location, it starts out with a value of zero
and retains its value even when the subroutine that declared it is not
active. Applications on various systems often rely on this behavior
inadvertently.
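
The difference between the two schemes can be sketched in C, where a static
local plays roughly the role of a FORTRAN variable named in a SAVE statement.
This is only an illustration under that analogy, not MIPS-FORTRAN code; the
function names next_serial and sum_to are hypothetical.

```c
/* Hypothetical illustration: a C `static` local behaves like a FORTRAN
 * variable named in a SAVE statement. It lives at one fixed location,
 * starts at zero, and keeps its value between calls. */
int next_serial(void)
{
    static int serial;      /* statically allocated, zero-initialized */

    serial = serial + 1;    /* value survives until the next call */
    return serial;
}

/* An automatic (stack) local carries no such guarantee: it must be
 * assigned on every call before it is read. */
int sum_to(int n)
{
    int total = 0;          /* explicit initialization, never assumed */
    int i;

    for (i = 1; i <= n; i++)
        total += i;
    return total;
}
```

A program that relies on the first behavior without asking for it (that is,
without SAVE or -static) is the kind that fails after porting.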

Automatic allocation uses a stack to implement local variables. It has several 
advantages. 

First, because current local variables reside near the current stack pointer, the 
compiler can address them with short-offset load and store instructions, which 
execute more rapidly than large-offset instructions. 
Second, because local variables are popped from the stack when a subroutine
returns, the total memory required for the program is less, and subroutines
that are never active simultaneously can share memory for their local
variables.

Third, automatic allocation permits the global optimizer to more effectively
allocate local variables to registers within a subroutine. This is because
the optimizer does not need to do either of the following:

• load the initial values of the variables from global memory at the 
start of the subroutine 

• restore their final values to memory when the subroutine returns. 



8.2 Retention of Data 



MIPS-FORTRAN does not support the retention of data passed as parameters in
previous calls to different entry points of a subroutine. This effect is not
allowed by the FORTRAN standard and is error prone. However, it is supported
by some FORTRAN implementations and is required by some FORTRAN programs.

Consider this example: your program calls an entry point to a subprogram with
certain arguments. It then calls the subprogram again, either at a different
entry point or at the subprogram itself, and the second call assumes that the
arguments to the first call remain valid. MIPS-FORTRAN does not support this
usage. However, you can do one of the following to retain the data:



• assign the arguments to local variables in the subprogram and use
  the -static switch to retain the values of the local variables.



• place the variable in a global common. 

8.3 Variable Length Argument Lists 



MIPS-FORTRAN does not support variable length argument lists, so your program 
can't call a routine the first time with fifteen arguments and a second time with two argu- 
ments. 



8.4 Runtime Checking 



Compile your program with runtime checking using the -C option. The -C option
generates code to check that subscripts do not exceed the range specified in
the program. Storing one byte past the end of an array of characters may be
harmless on one system if the compiler decided not to use the byte for
anything. However, an execution error may occur on another system if the
compiler uses that byte to store something such as a subroutine address.
Because the extra checks degrade performance, remove the -C option once the
program is debugged. Subscript checking does not work if array parameters are
declared as one element, which is common in older programs. To get around
this, use the FORTRAN 77 "*" declaration.
8.5 Alignment of Data Types



RISComputer architecture imposes certain rules governing how data may be 
aligned in memory. Basically, a variable of size n bytes must be aligned on a 
boundary whose address is a multiple of n bytes, up to a maximum of 8 bytes. 
For example, because a half-word occupies two bytes, its address must be a mul- 
tiple of two. 
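
The rule above can be written down as a few lines of C. This is a sketch of
the stated rule only (alignment equals the object's size, capped at 8 bytes);
the function names are hypothetical, and sizes are assumed to be powers of
two and at least 1.

```c
#include <stddef.h>

/* Required alignment for an object of `size` bytes under the rule
 * described above: the object's own size, capped at 8 bytes.
 * Assumes `size` is a power of two (1, 2, 4, 8, ...). */
size_t required_alignment(size_t size)
{
    return size > 8 ? 8 : size;
}

/* Nonzero if `address` is a legal location for an object of `size` bytes. */
int is_aligned(size_t address, size_t size)
{
    return address % required_alignment(size) == 0;
}

/* Round `offset` up to the next legal location for such an object. */
size_t align_up(size_t offset, size_t size)
{
    size_t a = required_alignment(size);

    return (offset + a - 1) / a * a;
}
```

For instance, address 6 is legal for a half-word but not for a 4-byte word;
the next legal word address is 8.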

High-level languages also impose rules about where you can assume data
appears in memory. In most cases, the language rules forbid the same things
that the architecture forbids.

Occasionally, the rules conflict. For example, the ANSI X3.9-1978 standard
for FORTRAN explicitly permits certain double-precision (8-byte) variables to
lie on the same boundary as any real (4-byte) variable; but the RISComputer
architecture requires the double-precision variable to be aligned on an
8-byte boundary.

The RISCompiler system supports the alignment rules imposed by each of its 
languages, even when they are more permissive than the architecture. FOR- 
TRAN, for example, deliberately avoids performing double-word load or store 
operations on certain double-precision variables. 

Some extensions such as integer*2, which are not part of any language
standard, cause problems. For example, consider the following common block:

      common /x/ i, j, k, l
      integer*2 j, l, q(6)
      integer*4 i, k
      equivalence (q(1), i)

The compiler normally inserts a half-word of padding between j and k to
conform to alignment rules, but that prevents q(6) from lying atop l.

Modifying your programs to align data according to the rules of the RISCom- 
puter architecture improves their performance. In the previous example, revers- 
ing the order of j and k within the common block eliminates the need for padding 
at the cost of changing the relationship between the array q and the scalar vari- 
ables. 
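
The effect of padding on the common block above can be checked with a small
layout calculation. The following C sketch assumes the simple rule from the
start of this section (each member padded up to a multiple of its own size,
capped at 8 bytes); layout_block is a hypothetical helper, not part of the
compiler system.

```c
#include <stddef.h>

/* Lay out members in order, padding each one up to a multiple of its
 * own size (capped at 8 bytes). Writes each member's byte offset into
 * `offsets` and returns the total number of bytes used. */
size_t layout_block(const size_t *sizes, size_t count, size_t *offsets)
{
    size_t next = 0;
    size_t i;

    for (i = 0; i < count; i++) {
        size_t align = sizes[i] > 8 ? 8 : sizes[i];

        next = (next + align - 1) / align * align;  /* insert padding */
        offsets[i] = next;
        next += sizes[i];
    }
    return next;
}
```

With member sizes 4, 2, 4, 2 (the order i, j, k, l) the offsets come out as
0, 4, 8, 12 and the block occupies 14 bytes, so the twelve bytes of q end
inside k. With sizes 4, 4, 2, 2 (j and k exchanged) the offsets are 0, 4, 8,
10 and no padding is needed.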

Rearranging the order of variables within a common block is not always
practical. However, you can use certain "hidden" options of the compiler
system to generate code that tolerates misalignments at some cost in
performance. When uncertain whether an object will be misaligned, the
compiler generates slower code sequences.
You may choose one of the following three options to deal with various
degrees of misalignment:

-align8     Permits objects larger than 8 bits to be aligned on 8-bit
            boundaries. This option requires the greatest amount of
            space; however, it is the most complete solution; 16-bit
            padding is not inserted for integer*2 objects within common
            blocks.

-align16    Permits objects larger than 16 bits to be aligned on 16-bit
            boundaries; 16-bit objects must still be aligned on 16-bit
            boundaries (MC68000-like alignment rules); 16-bit padding
            is not inserted for integer*2 objects within common blocks.

-align32    Permits objects larger than 32 bits to be aligned on 32-bit
            boundaries; 16-bit objects must still be aligned on 16-bit
            boundaries, and 32-bit objects must still be aligned on
            32-bit boundaries. This option requires the least amount of
            space, but isn't a complete solution; 16-bit padding is
            inserted for integer*2 objects within common blocks.

You should also use the following option, whichever of the above options you
choose, unless experimentation proves this impossible:

-align_common   Assumes that all common blocks are aligned properly, even
                though objects within the common blocks may be
                misaligned. This option generates better code. Without it,
                the assembler assumes that all global objects in languages
                like C and FORTRAN may be misaligned, even though they
                appear to be aligned, because they might be aliased against
                initialized objects in other modules to force the link
                editor to misalign them.

Pass these options specifically to the FORTRAN and assembly phases of the
compiler system by preceding them with -Wfb, as shown:

    f77 -Wfb,-align_common,-align16 ...

Two problems are not solved by these options: 

Your program must not perform I/O directly on misaligned objects or perform
any other operation that requires passing them by reference to runtime
library routines that have not been compiled with the -align flags.

You can circumvent this problem by copying misaligned objects 
to or from aligned temporaries before performing I/O. If the 
misaligned data is accessed only within libraries, and not by the 
kernel, you can circumvent the problem by using a runtime fix- 
up package which traps unaligned references and repairs them 
dynamically. See the unaligned(3) manual page for more infor- 
mation. 
Keep in mind that trapping is expensive in terms of execution
time. 
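
The technique of copying through an aligned temporary can be sketched in C
with memcpy, which moves the data byte by byte and therefore never performs a
misaligned load or store. The function names are hypothetical; the sketch
moves an int, which is four bytes on MIPS.

```c
#include <string.h>

/* Read an int that may sit at a misaligned address by copying it,
 * byte by byte, into a properly aligned temporary. */
int load_int_unaligned(const unsigned char *src)
{
    int tmp;                        /* aligned by the compiler */

    memcpy(&tmp, src, sizeof tmp);
    return tmp;
}

/* Store an int to a possibly misaligned address. */
void store_int_unaligned(unsigned char *dst, int value)
{
    memcpy(dst, &value, sizeof value);
}
```

The aligned temporary, not the misaligned original, is what you would then
pass to an I/O routine.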



8.6 Inconsistent Common-Block Sizes 



ANSI FORTRAN requires that a named common block have the same size, but not
necessarily the same constituent variables, each time it occurs in a program.
However, programs often declare only the amount needed, thus making the
length of the common block vary. For example:

subroutine foo 
common /gdata/ theta 

end 

subroutine bar 

common /gdata/ theta, omega, radius 

end 

The FORTRAN compiler allows uneven block sizes when possible by allocating 
the space required by the largest instance of the common block. If, however, the 
varying size causes one instance of a common block to fall below the -G thresh- 
old while another instance of the same common block is too big to fit into the 
$gp area, a problem results. At best the link editor prints error messages and the 
compiler system makes less than optimal use of the $gp area. At worst, a falsely 
small instance of the common block causes the compiler to overflow the $gp 
area. For a more detailed discussion of the -G option, see Chapter 10 in this 
manual. 
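
The two behaviors described above can be modeled in a few lines of C. This is
a simplified sketch, not the link editor's actual algorithm: it takes the
size of each instance of one common block and the -G threshold, and the
function names are hypothetical.

```c
#include <stddef.h>

/* Space reserved for the block: the size of its largest instance. */
size_t merged_common_size(const size_t *sizes, size_t count)
{
    size_t largest = 0;
    size_t i;

    for (i = 0; i < count; i++)
        if (sizes[i] > largest)
            largest = sizes[i];
    return largest;
}

/* Nonzero if some instance falls at or below the -G threshold while
 * another exceeds it, which is the troublesome case described above. */
int straddles_threshold(const size_t *sizes, size_t count, size_t threshold)
{
    int small = 0, large = 0;
    size_t i;

    for (i = 0; i < count; i++) {
        if (sizes[i] <= threshold)
            small = 1;
        else
            large = 1;
    }
    return small && large;
}
```

For the gdata example, the two instances are 4 and 12 bytes if each variable
is a 4-byte real. The merged size is 12, and with a threshold of 8 the block
straddles the threshold.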

Use the link editor to report conflicting common block sizes by taking the
name of each common block, converting it to lower case, prepending a -y, and
appending an _. When you link your program, pass the above names to the link
editor. The names must precede the first object file specified in the command
line. For example, if the common blocks are named gdata, rsb31, and xtrnls,
then type in:

    f77 -ygdata_ -yrsb31_ -yxtrnls_ *.o -o myprog

The link editor reports the size that each common block has every time it occurs 
in an object file. The link editor also reports additional information about each 
common block; however, for common block problems, only size matters: 

    stdrfl.o: definition of common rsbg31_ size 1012
    arstdr.o: definition of common rsbg31_ size 4

8.7 Multiple Initializations of Common blockdata

ANSI FORTRAN 77 requires that you use a DATA statement on a named common
block only within a blockdata subprogram. An ordinary subroutine may
initialize only local variables, not common variables. The MIPS compiler
system does not enforce this restriction.

However, the MIPS compiler does enforce the ANSI FORTRAN 77 restriction
which requires that you initialize each common block within exactly one
subprogram. A variety of messages appear when you violate this restriction,
including error messages from the link editor citing multiply defined symbols
or messages from earlier phases of the compiler citing illegal init or
illegal space. For example:

    ugen: internal: line 6345: ../symbol.p, line 270
    illegal inits

To diagnose such problems, use the utility fsplit to split your program into many 
small files, with one subprogram per file. Then link the program and collect the 
multiply defined messages in a file. For each multiply-defined symbol, prepend 
a -y, and relink the program with these options preceding the list of object files in 
the command line. For example, if the link editor issues error messages for 
gdata_ and rsb31_, then relink with: 

f77 -ygdata_ -yrsb31_ *.o -o myprog 

The link editor uses the phrase definition of external data every time an object 
file initializes a symbol: 

bstimr.o: definition of external data gdata_ 
zuxeng.o: definition of external data rsb31_ 
stdrfl.o: definition of common rsbg31_ size 1012 
cmflol.o: definition of external data rsb31_ 
cmflow.o: definition of external data gdata_ 
arstdr.o: definition of common gdata_ size 4032

The phrase definition of common can appear repeatedly for a particular common 
block, but the phrase definition of external data must appear only once for each 
common block. 

Once you realize that two .o files are initializing the same common block, trans- 
fer the appropriate DATA statements from one to the other (or, preferably, to a 
blockdata subprogram), then recompile and relink. 

8.8 Endianness and integer*2



Special problems exist for porting FORTRAN programs between big- and
little-endian machines in addition to those discussed in Section 4.4.
Although FORTRAN programs pass arguments by reference (they pass the address
of the argument rather than the argument itself), a calling program cannot
declare the types of the formal arguments of the subroutine it calls.
Consider the following call:

      call foo(0.314159e1, 0.628318d1, 1234, 2468)

Clearly the first argument is type real or real*4, and the second argument is
type double precision or real*8. But the types of the third and fourth
arguments, which can be either integer*2 or integer*4, are unknown to the
compiler. Thus, the compiler allocates four bytes for each of these
arguments.

On a little-endian machine, where the address of an integer is the address of its 
low-order byte, this code works correctly even if subroutine foo expects the ar- 
guments to be integer*2, because the address is the same in either case. On a 
big-endian machine, where the address of an integer is the address of its high-or- 
der byte, this code fails: if a four-byte integer is passed to a subroutine which 
expects a two-byte integer, then the subroutine recognizes only the two upper 
bytes of the four-byte integer. 
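
The failure can be reproduced in C by reading only the first two bytes at the
address of a four-byte integer, which is what a callee expecting integer*2
effectively does. This is an illustrative sketch, assuming a two-byte short
and a four-byte int; the function names are hypothetical.

```c
#include <string.h>

/* Nonzero on a little-endian machine. */
int is_little_endian(void)
{
    int one = 1;
    unsigned char first;

    memcpy(&first, &one, 1);        /* byte at the lowest address */
    return first == 1;
}

/* What a callee expecting a two-byte integer sees when it is handed the
 * address of a four-byte integer: the two bytes at the lowest address.
 * Assumes short is two bytes and int is four bytes. */
short callee_view(const int *arg)
{
    short seen;

    memcpy(&seen, arg, sizeof seen);
    return seen;
}
```

Given the literal 1234, callee_view sees 1234 on a little-endian machine and
0 on a big-endian machine, because the significant bytes sit at the far end
of the four-byte value.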

There are two solutions: 

• If all of the formal arguments in your program are two-byte integers,
  and you also wish the compiler to use two-byte integers wherever you
  have declared variables as integer rather than integer*4, then you can
  use the -i2 option when you compile your program, and all literal
  integers will use only two bytes.

• If it is not possible to use -i2, then you must use temporary variables
  of type integer*2 to pass literal numbers to two-byte arguments:

      integer*2 temp1, temp2

      temp1 = 1234
      temp2 = 2468
      call foo(0.314159e1, 0.628318d1, temp1, temp2)

9
RISCwindows 



This chapter describes the issues that you need to be aware of when porting
an X program to RISCwindows. RISCwindows is MIPS' implementation of version
11 of the X Window System. RISCwindows 3.0 contains one toolkit: the X
toolkit intrinsics and the Athena Widgets from the X Consortium. Other
companies may have their own widget set and intrinsics; either may be
incompatible with MIPS'. Refer to the RISCwindows Reference Manual and the
Release Notes for more information.



9.1 Environment 



You can program RISCwindows using either the System V or BSD version of 
RISC/os by using the command line arguments -systype sysv (the default) or 
-systype bsd43. See Chapter 3 in this manual for more information. 



9.2 System V Issues

Headers and Defines

If your program uses #ifdef SYSV or the header file X11/Xos.h, then the
command line flags -DSYSV, -DMIPS, and -4bsd are required when you compile.
Always include these flags, since X11/Xos.h may be included by another
header file.

Sigset

X client programs should use the "sigset" family of signal management system
calls rather than "signal", because Xlib uses "sigset" and the two cannot be
mixed.

Linking

To link a program using Xlib, -lbsd is required. For programs using the
"Load" toolkit widget, -lmld is also required.



9.3 BSD Issues



If you are compiling a RISCwindows program under BSD, then you need only
include the command line flag -DMIPS when you compile.



9.4 Hardware Issues 

Refer to the technical reference manual that accompanies your MIPS workstation 
for specific information on the number of pixels per inch, color table, number of 
bit planes, and so on. 

10
PL/I 



This chapter describes the issues that you need to be aware of when porting a
PL/I program to a MIPS computer. You should be aware of the MIPS-PL/I
implementation for the RISCompiler System. This implementation is described
in Part I: Programmer's Guide of the MIPS-PL/I Programmer's Guide and
Language Reference Manual.



10.1 PL/I Extensions



MIPS PL/I conforms to a subset of the full ANSI PL/I called subset G. Some 
extensions have also been added which are discussed in Appendix G of the 
MIPS-PL/I Language Programmer's Guide. You should be familiar with the 
differences between the full PL/I language and the G subset before you port your 
program. 



10.2 Alignment of Data in Memory 



The way data is aligned in memory differs from system to system, so you will
probably have to write a program to convert the data to a usable format.

If your program specifies aligned data but sends unaligned data, it causes
problems. To solve the problem, the option

    -wk r -force, -unalign

causes the compiler to treat all formal and actual arguments of type bit as
if they are unaligned. This degrades the execution speed of the compiled
program, but relaxes somewhat the requirement that formal and actual bit
parameter types match exactly.



10.3 The ADDR() Function



The ADDR() function returns a pointer to the storage referenced by a
specified variable x. The variable x must not be a reference to a parameter
whose corresponding argument is an array that is a member of a dimensioned
structure, because the storage of such an array is fragmented and cannot be
accessed by a pointer and a based variable.

On many implementations, x must not be an unaligned bit-string or a structure
consisting entirely of unaligned bit-strings.

RISCompiler Components



11.1 Introduction 



This chapter discusses considerations for debugging, program checking,
compiling, and link editing your programs. It covers the following topics:

• Debugging Procedures. You compile programs for debugging using the -g
  option of the driver command that compiles your program, and then
  execute the resulting object with the dbx debugger.

• Program Checking. Several program checking tools are available to
  check the correctness of your program.

• Optimization. The optimizer can significantly improve the performance
  of your object program. The optimizer is invoked using one of the
  several -O options of the driver command. You should consider levels of
  optimization higher than the standard default once your program is
  successfully debugged.

• Link Editor Features. Several link editor options and techniques should
  be considered. These options are invoked by either a driver command
  (cc, pc, f77, pl1, cob) or the link editor ld command.

In addition to the information provided in this chapter, you may need to
refer to the Languages Programmer's Guide and the manual page for the driver,
dbx, or ld in the RISC/os User's Reference Manual.



11.2 Debugging 



This section gives a suggested procedure to follow when debugging your ported 
program. For a complete description of the debugger dbx, refer to the Lan- 
guages Programmer's Guide. 

If a program fails and you wish to use dbx to debug the failed program, do the 
following: 

1. Recompile the program using the following compiler options:

   • the -g debugging option, which causes the compiler system to
     generate the symbol table required by dbx.

   • the -O1 optimizing option (the default), which causes the compiler
     system to minimally optimize the resulting object. (Once the program
     is successfully debugged, you may want to recompile it using a higher
     level of optimization.)

   • the -signed and -varargs options (for C programs only).

   • the -static option (for FORTRAN programs only).

2. Execute the program. 

3. If a segmentation fault, bus error, or other error causes the program to
   fail, then use dbx to isolate the problem. Do a stack trace using the dbx
   where command to locate the point of failure.

4. If you know the approximate location of where the problem occurs, then do 
the following: 

• Use the dbx stop command to set a breakpoint just before the 
suspected problem location. 

• Use the dbx print command to display the current values contained in
  the pertinent variables.

• Use the dbx next or step command to incrementally execute the 
instructions after the breakpoint. Display and check the values 
of the variables as you execute each instruction. 

5. Use binary search techniques, as discussed in step 4, when you are trying to 
track down the source of corrupted data. You can also make a change to data 
or code to see what happens; understand the code before you do this. For 
example, sometimes all you need to do is to check for the symptom that re- 
sults in a problem, and bypass the code that would be executed. A classic 
example of this is programs that get segmentation faults for doing the follow- 
ing: 

        if (*sp == 'a') {

        }

   If sp is 0, then a segmentation fault occurs, but the code works as
   expected if it is changed to:

        if (sp && *sp == 'a') {

        }
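
The guarded test from step 5 can be packaged as a small function. This is a
minimal sketch; the name starts_with_a is hypothetical.

```c
#include <stddef.h>

/* Returns 1 if the string begins with 'a'. The pointer is tested before
 * it is dereferenced, so a null pointer yields 0 instead of a fault. */
int starts_with_a(const char *sp)
{
    return sp != NULL && *sp == 'a';
}
```

Because && evaluates its left operand first and stops when that operand is
false, *sp is never evaluated for a null pointer.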

Programming Tools



11.3.1 Lint 



A correct program is not necessarily a portable program as it may run success- 
fully on one system, but not another. Debugging alone does not guarantee cor- 
rectness. In fact, no tool can completely guarantee the correctness of a program; 
however, a few tools can help check whether a program is operating correctly. 
These tools are appropriate to use either when porting a program from another 
system to a RISComputer, or when writing a program on a RISComputer in- 
tended to be portable to other systems. 

One such tool is Lint, a static program checker for the C programming language. 
Lint provides the sort of checking that is typically performed by compilers in 
other programming languages. Its use for C programs is highly recommended. 
See the Language Programmer's Reference for more information. 



11.3.2 Subscript Range Checks



Another tool is subscript range checking. It is not uncommon for a program to 
reference an array outside of the declared bounds. An error of this sort may go 
undetected if, for example, the location referenced exists, but is otherwise un- 
used. When the program is ported to another system the incorrect reference may 
instead access a critical location, and the program will fail to operate correctly. 

To detect subscript range errors, your program may be compiled with a special 
option that generates extra code to verify that the indexes to array references are 
within the declared bounds of the array. This option is available in Pascal and 
FORTRAN. It is the default in Ada. For C, the language and its style of use
do not make subscript range checking useful, so no compiler option is
provided.
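
In C, a comparable check has to be written by hand wherever it is wanted. The
following is a minimal sketch of such a check, assuming the caller knows the
array length; checked_store is a hypothetical helper, not a library routine.

```c
#include <stddef.h>

/* Store `value` in a[index] only if the index lies inside the array.
 * Returns 0 on success and -1 on a subscript range error. */
int checked_store(double *a, size_t length, size_t index, double value)
{
    if (index >= length)
        return -1;          /* out of bounds: refuse the access */
    a[index] = value;
    return 0;
}
```

Unlike the compiler-generated checks, this version reports the error to the
caller instead of trapping.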

A Pascal program compiled without subscript range checking would run:

    % pc -g -o example example.p

However, if you compiled the same program with subscript checks, you would
receive a subscript error during run time.

    % pc -C -g -o example example.p
    % ./example
    Trace/BPT trap (core dumped)

At this point, you could use dbx(1) to locate the source line with the
subscript range error.

The -C compile option also works for a FORTRAN program. Older FORTRAN
programs require some modification to work with subscript range checking
turned on. It was once common in FORTRAN to declare array parameters to have
dimension 1 when the actual size was passed as a separate parameter:

      subroutine zero (a, n)
      real a(1)
      do 10 i = 1, n
   10 a(i) = 0
      end

In FORTRAN 77 the declaration could correctly be written as:

      real a(n)

or

      real a(*)

if the array size is not passed as a parameter.

11.3.3 Dynamic Storage Allocation

Just as programs sometimes reference outside the bounds of an array, a common
error is to call a dynamic storage allocator and reference outside of the
allocated block. Since the compiler often does not know the size of the block
when a pointer based reference is made, it cannot generate code to verify the
access, as with subscript range checking. However, a special version of the
standard dynamic storage allocation routines malloc(), free(), and realloc()
called malloccheck(1) is available that checks for incorrect uses of dynamic
storage. Add -lmalloccheck to your link command line to use this version. It
checks for the following:

Writing beyond an allocated block. A common error is to write beyond the 
end of an allocated block. The malloccheck allocator allocates extra space both 
before and after the block it returns to you and initializes this space to special bit 
patterns. A write outside the block will usually affect these pattern words. When 
the block is freed, the pattern words are checked, and if modified a warning is 
given. 

Freeing a block twice. Another error is to free a block twice. Malloccheck
does not re-use storage after it is freed, but instead simply marks it as
such. A second free to the same block generates a warning.

Referencing a block after it is freed. Another error is to reference a block
after it is freed. This often works because the freed storage is not
immediately re-used. Malloccheck's free routine overwrites the data when it
is freed, which usually causes subsequent references to return unexpected
results, leading to a detectable program failure later.

Initializing allocated storage to zero. Some programs inadvertently assume
the allocated storage is initialized to zero, even though the standard
malloc() and free() routines do not guarantee this. Malloccheck initializes
the allocated storage to non-zero so that such assumptions lead to program
failure.
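
The pattern-word technique can be illustrated with a small wrapper around
malloc(). This is a sketch of the general idea only, not the malloccheck
implementation; the pattern value, the header layout, and the guarded_*
names are all assumptions made for the example.

```c
#include <stdlib.h>
#include <string.h>

#define GUARD_WORD 0x5a5a5a5aUL   /* arbitrary pattern, an assumption */

/* Header kept immediately before the caller's block. */
struct guard_header {
    size_t size;                  /* bytes requested by the caller */
    unsigned long front_guard;    /* pattern word preceding the block */
};

/* Allocate `size` bytes with a pattern word before and after them. */
void *guarded_alloc(size_t size)
{
    unsigned long guard = GUARD_WORD;
    struct guard_header *h = malloc(sizeof *h + size + sizeof guard);

    if (h == NULL)
        return NULL;
    h->size = size;
    h->front_guard = guard;
    /* The trailing word may be misaligned, so copy it bytewise. */
    memcpy((unsigned char *)(h + 1) + size, &guard, sizeof guard);
    return h + 1;
}

/* Return 0 if both pattern words are intact, -1 if either was overwritten. */
int guarded_check(const void *block)
{
    const struct guard_header *h = (const struct guard_header *)block - 1;
    unsigned long rear;

    memcpy(&rear, (const unsigned char *)block + h->size, sizeof rear);
    return (h->front_guard == GUARD_WORD && rear == GUARD_WORD) ? 0 : -1;
}

/* Check the pattern words, then release the storage. */
int guarded_free(void *block)
{
    int status = guarded_check(block);

    free((struct guard_header *)block - 1);
    return status;
}
```

A write one byte past the end of a block lands in the trailing pattern word,
so guarded_check() and guarded_free() report it. As in malloccheck, the
damage is noticed only when a check runs, not at the moment of the bad write.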

Malloccheck's primary checking is done when blocks are freed. An error may go
undetected if a block is never freed, or if the error occurs after it is
freed. Also, an inconsistency detected by free may be difficult to trace to
an error made long before. For all of these reasons, malloccheck provides the
malloc_status() subroutine, which checks the entire dynamic storage
allocation area. Calls to malloc_status() can be inserted in the program as
necessary to locate the source of an error. During program development a
single call to malloc_status() at the end of the program is useful. The
argument to malloc_status() specifies the level of checking:

    malloc_status(0);

checks for errors and prints some summary statistics. A level of 1:

    malloc_status(1);

checks for errors, prints some summary statistics, and lists all blocks that
remain in use. This is useful for finding blocks that the program failed to
free. Failure to free storage can lead to eventual memory exhaustion and
program failure on a large run.

For example, if you compile and link example.c with the default allocator,
example runs:

    % cc -g -o example example.c

However, if you link example with malloccheck, it finds the following errors:

    % cc -g -o example example.c -lmalloccheck
    % ./example
    Error: check word at 10003834 preceding block at
        10003838 bashed from 87ccccc1 to 87cccc01.
    Error: pad byte at 100038b8 of block at 100038b0
        bashed from 5a to 02.
    Error: freeing block at 100038d0 again.
    Error: check word at 100038ec following block at
        100038f8 bashed from 87ccccc2 to 03ccccc2.
    Error: check word at 10003908 following block at
        10003910 bashed from 87ccccc3 to 04ccccc3.
    Error: trailer size word at 10003934 for block at
        10003928 bashed to 05000004.
    Error: realloc (NULL, 20).
    Error: realloc of free block at 10003968.
    Warning: malloc (268435456). Will return NULL.
    Warning: sbrk (8388640) failed. Will return NULL.
    malloc_status (1):
    Error: check word at 10003834 preceding block at
        10003838 bashed from 87ccccc1 to 87cccc01.

11.4 Optimization



The MIPS optimizer is vulnerable to human error, for example, incorrectly
specifying the size of a variable or the nature of a formal argument. In the
following Pascal code, the optimizer may move the if statement to precede the
loop. Because name_changer is declared to receive only one character, the
optimizer assumes that name[5] cannot change during the loop:

    type
        array5 = packed array [1..5] of char;
    var
        i: integer;
        name: array5;
    procedure name_changer (var c: char); extern;

    for i := 1 to 10 do
    begin
        if name[5] > '9' then goto 5;
        name_changer (name[1]);
        writeln(name);
    end;

This assumption is true if name_changer is coded in Pascal and the formal
argument agrees with the actual argument. If it is coded in C, and the formal
argument is char *c, then name_changer may alter name[5] during the loop. To
solve this problem, be specific. Don't specify var c: char if it is actually
var c: array5 from the point of view of the external procedure.

Similar problems arise in FORTRAN programs that assume declaring a formal
argument or common block to be an array of one element is the same as
declaring it specifically:

      common /x/ ary(1)

      call matset (ary)

If a common declaration in another program unit specifies ary(100), then the
variable ary becomes 100 elements large when you link the program; but in
this particular section, the optimizer behaves as if the variable had only
one element. This problem can be solved as follows:

• Use consistent common declarations.

• Use an ANSI FORTRAN 77 declaration in the form of integer parm(*)
  rather than the traditional trick of integer parm(1) when the size of a
  formal parameter may vary.
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. __ ___ Programming Tools 

11.5 The Link Editor 

This section describes the special features of the link editor that you
should be aware of when porting a program. For information on the link editor
and its libraries, refer to the ld(1) manual page in the RISC/os (UMIPS)
User's Reference Manual.

11.5.1 The -G Option

The RISCompiler system sets up one register called gp to point to a 64Kbyte 
block of global memory that can be addressed in half the number of instructions 
required for a normal global access. It allocates by default to the gp area any 
global variable up to a maximum size of eight bytes. You can change the default 
size using the driver -G option (see the Language Programmer's Guide or the 
associated compiler manual page in the User's Reference Manual). 
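
The allocation rule can be modeled roughly in C. The sketch below is a
simplified model, not the link editor's algorithm: it ignores alignment
padding and any other contents of the gp area, and the function names are
hypothetical. It adds up the sizes of the variables that fall at or below the
-G threshold and compares the total with the 64-Kbyte limit.

```c
#include <stddef.h>

#define GP_AREA_BYTES 65536UL    /* the 64-Kbyte region addressed via gp */

/* Total bytes that would be placed in the gp area: every variable whose
 * size does not exceed the -G threshold. */
unsigned long gp_bytes_used(const unsigned long *sizes, size_t count,
                            unsigned long threshold)
{
    unsigned long total = 0;
    size_t i;

    for (i = 0; i < count; i++)
        if (sizes[i] <= threshold)
            total += sizes[i];
    return total;
}

/* Nonzero if the selected variables fit in the gp area. */
int gp_area_fits(const unsigned long *sizes, size_t count,
                 unsigned long threshold)
{
    return gp_bytes_used(sizes, count, threshold) <= GP_AREA_BYTES;
}
```

In this model, raising the threshold moves more variables into the gp area
until the total passes 64 Kbytes, and a threshold of 0 selects no variables
at all.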

There are three kinds of gp-related problems:

1. The gp area overflows because gp-relative data doesn't fit into 
its allocated 64Kbytes of memory. 

If this problem occurs, the link editor prints a prediction of the 
best value to use as a maximum size in the -G option. The "best 
value" places as many global variables as possible into the 
64Kbyte area to improve performance, but excludes enough vari- 
ables to prevent the area from overflowing. 

However, the "best value" is merely a prediction and may not produce
successful results. To make sure that no gp area overflow occurs, and to
produce an executable object immediately, note the best value provided by the
link editor, and then recompile and relink your program using the -G 0
option. You can then move that copy to a safe place and recompile and relink
using the recommended best value.

If your program does not fail, but you want to improve perform- 
ance, then use the -bestGnum option. This option causes the 
link editor to predict a best value. Recompile and relink with the 
new value. However, you should first debug the program at the 
default setting, save a working copy, and then experiment with 
the best number prediction. 

2. A variable larger than the maximum specified size is in the gp 
area. 

This problem can happen when two program modules disagree
about the data type of an object. For example, one module sees
the data as a small variable and addresses it within the gp area,
and the other sees it as a large variable.

The link editor retains the larger size of the variable, when
possible, and places it into the gp area with a warning message.
This may cause the gp area to overflow. If the gp area overflows,
then use the -G option or (preferably) reconcile the
conflicting declarations so as to retain the advantages of using
the gp area for other variables.

Sometimes the link editor cannot put the large variable into the
gp area because it is a synonym for some other object that cannot
be addressed relative to the gp register. If this is the case, you
must reconcile the conflicting declarations. For example, suppose
one module defines an object as a function, which cannot
be addressed relative to the gp register:

    int foo();

    bar(foo);

and another defines it as a small data item:

    int foo, *ptr;

    ptr = &foo;

Most inconsistently sized declarations are caused by a violation 
of the ANSI standard with regard to FORTRAN common blocks. 
See Chapter 8 in this manual for details. 

3. The link editor believes that the gp register isn't initialized.

This problem can occur when you use your own startup code,
rather than the runtime startup code in crt0.o or crt1.o provided
when a RISCompiler driver (cc, f77, pc, cobol, or pl1) links
your program.

The runtime startup code loads a link editor-defined symbol
called _gp into the gp register. If you use your own startup code
instead, load _gp into some register ($0 is acceptable) even if
you load gp with some other value that you have calculated
yourself; otherwise, the link editor issues an error message.
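
The idea behind the "best value" described under problem 1 can be sketched in C as follows. This is only an illustration of the trade-off, not the link editor's actual algorithm: the function name best_g_value is hypothetical, alignment padding is ignored, and the only figure taken from the text is the 64Kbyte size of the gp area.

```c
#include <stddef.h>
#include <stdlib.h>

#define GP_AREA_BYTES 65536UL   /* 64Kbyte gp area */

/* Comparison function for qsort: ascending order of size. */
static int cmp_size(const void *a, const void *b)
{
    size_t x = *(const size_t *)a, y = *(const size_t *)b;
    return (x > y) - (x < y);
}

/* Hypothetical sketch: given the sizes of n global variables, return the
   largest variable size g such that every variable of size <= g fits
   in the gp area together. Returns 0 if even the smallest variables
   overflow the area. The array is sorted in place. */
size_t best_g_value(size_t *sizes, size_t n)
{
    size_t total = 0, best = 0, i = 0;

    qsort(sizes, n, sizeof sizes[0], cmp_size);
    while (i < n) {
        /* Variables of equal size are admitted or excluded together,
           because the threshold cannot tell them apart. */
        size_t s = sizes[i], group = 0, j = i;
        while (j < n && sizes[j] == s) {
            group += s;
            j++;
        }
        if (total + group > GP_AREA_BYTES)
            break;
        total += group;
        best = s;
        i = j;
    }
    return best;
}
```

For example, with variables of 4, 4, 8, 40000, and 40000 bytes, the sketch returns 8: admitting the two 40000-byte variables would overflow the area, so they are excluded.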

Two details may help you in reconciling inconsistently sized declarations:

1. If a common variable is declared but not referenced in a module,
then the compiler allocates it outside the $gp area regardless of
its size. This allocation reduces possible problems. Therefore,
you should explicitly initialize unreferenced variables to zero, to
ensure that they are placed within the $gp area.

2. In C, you can force a scalar variable to be referenced as if it lay
outside the $gp area by declaring it to be an array of unspecified
size and referencing the first element (for example, int j[]; and
j[0] rather than int j; and j).
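
The second detail can be sketched in C as shown below. In the situation the text describes, some other module defines j as a plain scalar; so that this example is valid C on its own, it supplies a one-element array as a stand-in definition. The accessor names read_j and write_j are hypothetical.

```c
/* Declared as an array of unspecified size, so this module does not
   treat j as a small scalar. */
extern int j[];

/* Reference the "scalar" as the first element of the array. */
int read_j(void)
{
    return j[0];
}

void write_j(int value)
{
    j[0] = value;
}

/* Stand-in definition for this example only. In the case described
   above, the definition lives in another module. */
int j[1];
```

Calling write_j(7) and then read_j() returns 7, exactly as if j were an ordinary int.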
11.5.2 Forcing Library Extractions



The RISCompiler system link editor opens and searches only one library at a
time, in the order you specify. This can cause problems, as the following example
shows. Suppose you try to link a program p.o with two libraries, l1.a and l2.a, as
follows:

    cc -o p p.o l1.a l2.a

The components that the program and libraries contain or need are: 



File/Library    Contains    Imports/Exports

p.o                         imports l2proc
l1.a            l1.o        exports l1proc, imports l3proc
l2.a            l2.o        exports l2proc, imports l1proc
l2.a            l3.o        exports l3proc



When the program is linked:

1. The link editor sees that it needs to import l2proc for p.o.

2. It searches l1.a for l2proc, and does not find it.

3. The link editor closes l1.a and opens l2.a.

4. It finds l2proc, but cannot find l1proc (which l2.o imports) because
l1.a is closed.

If you specify l1.a and l2.a in the opposite order, then the link editor fails to
obtain l3proc.
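
The search just described can be modeled with a short C simulation. This is a sketch under stated assumptions, not the link editor's implementation: the table of members mirrors the example above, each library is opened once in command-line order and rescanned only while it is open, and all names in the code (link_unresolved, the L1PROC bit flags, and so on) are hypothetical.

```c
#include <stddef.h>

/* One bit per symbol in the example. */
enum { L1PROC = 1u << 0, L2PROC = 1u << 1, L3PROC = 1u << 2 };

struct member {
    int library;        /* 1 for l1.a, 2 for l2.a */
    unsigned exports;   /* symbols this object file defines */
    unsigned imports;   /* symbols this object file needs */
};

static const struct member members[] = {
    { 1, L1PROC, L3PROC },  /* l1.o in l1.a */
    { 2, L2PROC, L1PROC },  /* l2.o in l2.a */
    { 2, L3PROC, 0 },       /* l3.o in l2.a */
};
#define NMEMBERS (sizeof members / sizeof members[0])

/* `wanted` holds the symbols imported by the object files named on the
   command line; `order` lists the libraries in command-line order.
   Returns the set of symbols that are still unresolved at the end. */
unsigned link_unresolved(unsigned wanted, const int *order, size_t norder)
{
    unsigned defined = 0;
    int loaded[NMEMBERS] = {0};

    for (size_t k = 0; k < norder; k++) {
        int changed = 1;
        while (changed) {   /* rescan only while this library is open */
            changed = 0;
            for (size_t m = 0; m < NMEMBERS; m++) {
                if (members[m].library != order[k] || loaded[m])
                    continue;
                if (members[m].exports & wanted & ~defined) {
                    loaded[m] = 1;
                    defined |= members[m].exports;
                    wanted |= members[m].imports;
                    changed = 1;
                }
            }
        }
    }
    return wanted & ~defined;
}
```

With wanted set to L2PROC, the order l1.a, l2.a leaves L1PROC unresolved, and the opposite order leaves L3PROC unresolved, matching the two failures described above. If L1PROC is also wanted from the start, the order l1.a, l2.a resolves everything.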

The standard UNIX solution to this problem, in which you assemble a file
kludge.s containing:

    .globl l1proc

and link kludge.o prior to l1.a to import l1proc, does not work on the
RISCompiler system. The RISCompiler assembler notices that kludge.s does not
really use l1proc and, as an optimization, removes the request to import it. To
solve this problem, edit kludge.s so that it actually references l1proc:

    .extern l1proc
    .data
    .word l1proc

Simpler solutions are to:

1. Correct the problem on the command line by having the link editor search the
l1.a library twice:

    cc -o p p.o l1.a l2.a l1.a

2. Extract the object file and include it directly on the command line:

    ar x l1.a l1.o

    cc -o p p.o l1.o l1.a l2.a

11.5.3 The Semantics of a Library Search

Some programs assume that the link editor searches linearly within a library for
symbols that it wishes to import. The RISCompiler link editor libraries use a
hashed symbol table for faster linking, so the order in which .o files are added to
a .a file is insignificant.
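
To show why a hashed symbol table makes member order irrelevant, here is a minimal C sketch of a symbol-to-member lookup. It does not reflect the actual library format: the table size, the hash function, and the names symtab_add and symtab_find are all assumptions chosen for illustration.

```c
#include <stddef.h>
#include <string.h>

#define TABLE_SLOTS 16  /* hypothetical size; must exceed the symbol count */

struct entry {
    const char *symbol;  /* exported symbol name, or NULL if slot is empty */
    const char *member;  /* object file that exports it */
};

struct symtab {
    struct entry slots[TABLE_SLOTS];
};

/* Simple string hash reduced to a slot index. */
static unsigned hash_name(const char *s)
{
    unsigned h = 5381;
    while (*s)
        h = h * 33u + (unsigned char)*s++;
    return h % TABLE_SLOTS;
}

/* Record that `member` exports `symbol`, using linear probing on a
   collision. Returns 0 on success, or -1 if the table is full. */
int symtab_add(struct symtab *t, const char *symbol, const char *member)
{
    unsigned start = hash_name(symbol);
    for (unsigned i = 0; i < TABLE_SLOTS; i++) {
        struct entry *e = &t->slots[(start + i) % TABLE_SLOTS];
        if (e->symbol == NULL) {
            e->symbol = symbol;
            e->member = member;
            return 0;
        }
    }
    return -1;
}

/* Return the member that exports `symbol`, or NULL if there is none. */
const char *symtab_find(const struct symtab *t, const char *symbol)
{
    unsigned start = hash_name(symbol);
    for (unsigned i = 0; i < TABLE_SLOTS; i++) {
        const struct entry *e = &t->slots[(start + i) % TABLE_SLOTS];
        if (e->symbol == NULL)
            return NULL;
        if (strcmp(e->symbol, symbol) == 0)
            return e->member;
    }
    return NULL;
}
```

A lookup goes straight to the symbol's slot, so as long as each symbol is exported by only one member, the result is the same whichever order the members were added in.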

The link editor does not consider a "common" declaration to be a request to
import every module that issues an identical "common" declaration. For example,
a declaration of int errno in a C-coded main program does not cause the link
editor to import every module that similarly declares int errno; those modules are
imported only if they specifically export, by means of a function definition or an
initialized data definition, some symbol that your program specifically imports.

However, a "common" in the library can satisfy an import request without
actually adding the library module to the program. For example, if your main
program declares extern int errno, the occurrence of int errno in a module foo.o in
the library would create a common "int errno" in the linked program, without
necessarily adding foo.o to the linked program. This rather exotic behavior
makes our link editor compatible with the one provided by the standard BSD
UNIX distribution.

11.5.4 Libraries Versus Object Files

If you want to bundle together a group of infrequently changed object files
because it is more convenient to specify a single name when you link, it is faster to
use ld -r to bundle them into a .o file than to use ar -r to add them to a .a file.